(57) Abstract

The present invention defines an assay useful for screening
libraries of synthetic or biological compounds for their
ability to bind specific DNA test sequences. The assay is
also useful for determining the sequence specificity and
relative DNA-binding affinity of DNA-binding molecules for
any particular DNA sequence. The assay is a competition
assay in which binding of a test molecule to a DNA test
sequence changes the binding characteristics of a
DNA-binding protein to its binding sequence. When such a
test molecule binds the test sequence the equilibrium of the
DNA:protein complexes is disturbed, generating changes in
the ratio between unbound DNA and DNA:protein complexes.
The assay is versatile in that any test sequence can be
tested by placing the test sequence adjacent to a defined
protein binding DNA screening sequence.

SCREENING ASSAY FOR THE DETECTION OF
DNA-BINDING MOLECULES

Field of the Invention

The present invention relates to a method, a system,
and a kit useful for the identification of molecules that
specifically bind to defined nucleic acid sequences.

Background of the Invention

Several classes of small molecules that interact with
double-stranded DNA have been identified. Many of these
small molecules have profound biological effects. For
example, many aminoacridines and polycyclic hydrocarbons
bind DNA and are mutagenic, teratogenic, or carcinogenic.
Other small molecules that bind DNA include: biological
metabolites, some of which have applications as antibiotics
and antitumor agents, including actinomycin D, echinomycin,
distamycin, and calicheamicin; planar dyes, such as
ethidium and acridine orange; and molecules that contain
heavy metals, such as cisplatin, a potent antitumor drug.

Most known DNA-binding molecules do not have a known
sequence binding preference. However, there are a few
small DNA-binding molecules that preferentially recognize
specific nucleotide sequences, for example: echinomycin
preferentially binds the sequence [(A/T)CGT]/[ACG(A/T)]
(Gilbert et al.); cisplatin covalently cross-links a
platinum molecule between the N7 atoms of two adjacent
deoxyguanosines (Sherman et al.); and calicheamicin
preferentially binds and cleaves the sequence TCCT/AGGA
(Zein et al.).

The biological response elicited by most therapeutic
DNA-binding molecules is toxicity, specific only in that
these molecules may preferentially affect cells that are
more actively replicating or transcribing DNA than other
cells. Targeting specific sites may significantly decrease
toxicity simply by reducing the number of potential binding
sites in the DNA. As specificity for longer sequences is
acquired, the nonspecific toxic effects due to DNA-binding
may decrease. Many therapeutic DNA-binding molecules
initially identified based on their therapeutic activity in
a biological screen have been later determined to bind DNA.

Therefore, there is a need for an in vitro assay
useful to screen for DNA-binding molecules. There is also
a need for an assay that allows the discrimination of
sequence binding preferences of such molecules.
Additionally, there is a need for an assay that allows the
determination of the relative affinities of a DNA-binding
molecule for different DNA sequences. Finally, there is a
need for therapeutic molecules that bind to specific DNA
sequences.

Summary of the Invention

The present invention provides a method for screening
molecules or compounds capable of binding to a selected
test sequence in a duplex DNA. The method involves adding
a molecule to be screened, or a mixture containing the
molecule, to a test system. The test system includes a DNA
binding protein that is effective to bind to a screening
sequence, i.e., the DNA binding protein's cognate binding
site, in a duplex DNA with a binding affinity that is
preferably substantially independent of the sequences
adjacent the binding sequence; these adjacent sequences
are referred to as test sequences. However, the DNA binding
protein is sensitive to binding of molecules to such a test
sequence when the test sequence is adjacent the screening
sequence. The test system further includes a duplex DNA
having the screening and test sequences adjacent one
another. Also, the binding protein is present in an amount
that saturates the screening sequence in the duplex DNA.
The test molecule is incubated in contact with the test
system for a period sufficient to permit binding of the
molecule being tested to the test sequence in the duplex
DNA. The amount of binding protein bound to the duplex DNA
is compared before and after the addition of the test
molecule or mixture.

Candidates for the screening sequence/binding protein
may be selected from the following group: EBV origin of
replication/EBNA, HSV origin of replication/UL9, VZV origin
of replication/UL9-like, HPV origin of replication/E2,
interleukin 2 enhancer/NFAT-1, HIV-LTR/NFAT-1,
HIV-LTR/NFkB, HBV enhancer/HNF-1, fibrinogen
promoter/HNF-1, lambda oL-oR/cro, and essentially any other
DNA:protein interactions.

A preferred embodiment of the present invention
utilizes the UL9 protein, or DNA-binding proteins derived
therefrom, and its cognate binding sequence SEQ ID NO:1,
SEQ ID NO:2, SEQ ID NO:17, or SEQ ID NO:15.

The test sequences can be any combination of sequences
of interest. The sequences may be randomly generated for
shot-gun approach screening or specific sequences may be
chosen. Some specific sequences of medical interest
include the following sequences involved in DNA:protein
interactions: EBV origin of replication, HSV origin of
replication, VZV origin of replication, HPV origin of
replication, interleukin 2 enhancer, HIV-LTR, HBV enhancer,
and fibrinogen promoter. Furthermore, a set of assay test
sequences comprising all possible sequences of a given
length could be tested (e.g., all four base pair sequences).
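
As an illustration of such a defined set, the following minimal Python sketch enumerates the top strands of all four base pair sequences, and also the subset sharing the base composition of 5'-GATC-3' (the set described for Figure 14). The function names are hypothetical and are not part of the disclosure.

```python
from itertools import permutations, product

BASES = "ACGT"  # top-strand bases, written 5' to 3'


def all_test_sequences(length):
    """Return every top-strand sequence of the given length, in alphabetical order."""
    return ["".join(p) for p in product(BASES, repeat=length)]


def same_composition(seq):
    """Return the distinct sequences having the same base composition as seq."""
    return sorted({"".join(p) for p in permutations(seq)})


four_mers = all_test_sequences(4)
print(len(four_mers))                  # 4**4 = 256 sequences
print(len(same_composition("GATC")))   # 4! = 24 sequences
```

Such a list only defines the ordered set of test sequences; each sequence would still have to be placed adjacent to the screening sequence in a synthesized oligonucleotide.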

In the above method, comparison of protein-bound to
free DNA can be accomplished using any detection assay,
preferably a gel band-shift assay, a filter-binding assay,
or a capture/detection assay.

In one embodiment of the DNA capture/detection assay,
in which the DNA that is not bound to protein is captured,
the capture system involves the biotinylation of a
nucleotide within the screening sequence (i) that does not
eliminate the protein's ability to bind to the screening
sequence, (ii) that is capable of binding streptavidin, and
(iii) where the biotin moiety is protected from
interactions with streptavidin when the protein is bound to
the screening sequence. The capture/detection assay also
involves the detection of the captured DNA.

In another embodiment of the DNA capture/detection
assay, in which the DNA:protein complexes are captured,
the capture system involves the use of nitrocellulose
filters under low salt conditions to capture the
protein-bound DNA while allowing the non-protein-bound DNA
to pass through the filter.

The present invention also includes a screening system
for identifying molecules that are capable of binding to a
test sequence in a duplex DNA sequence. The system
includes a DNA binding protein that is effective to bind to
a screening sequence in a duplex DNA with a binding
affinity that is substantially independent of a test
sequence adjacent the screening sequence. The binding of
the DNA-binding protein is, however, sensitive to binding of
molecules to the test sequence when the test sequence is
adjacent the screening sequence. The system includes a
duplex DNA having the screening and test sequences adjacent
one another. Typically, the binding protein is present in
an amount that saturates the screening sequence in the
duplex DNA. The system also includes means for detecting
the amount of binding protein bound to the DNA.

As described above, the test sequences can be any
number of sequences of interest.

The screening sequence/binding protein can be selected
from known DNA:protein interactions using the criteria and
guidance of the present disclosure. The same criteria can
also be applied to DNA:protein interactions discovered
later.

A preferred embodiment of the screening system of the
present invention includes the UL9 protein, or DNA-binding
protein derived therefrom (e.g., the truncated UL9 protein
designated UL9-COOH). In this embodiment the duplex DNA
has (i) a screening sequence selected from the group
consisting of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:17 and
SEQ ID NO:15, and (ii) a test sequence adjacent the
screening sequence, where UL9 is present in an amount that
saturates the screening sequence. The system further
includes means for detecting the amount of UL9 bound to the
DNA, including band-shift assays, filter-binding assays,
and capture/detection assays.

The present disclosure describes the procedures needed
to test DNA:protein interactions for their suitability for
use in the screening assay of the present invention.

The present invention further defines DNA capture
systems and detection systems. Several methods are
described. A filter binding assay can be used to capture
the DNA:protein complexes or, alternatively, the DNA not
bound by protein can be captured by the following method.
In the first part of this system, the cognate DNA binding
site of the DNA binding protein is modified with a
detection moiety, such as biotin or digoxigenin. The
modification must be made to the site in such a manner that
(i) it does not eliminate the protein's ability to bind to
the cognate binding sequence, (ii) the moiety is accessible
to the capturing agent (e.g., in the case of biotin the
agent is streptavidin) in DNA that is not bound to protein,
and (iii) the moiety is protected from interactions
with the capture agent when the protein is bound to the
screening sequence.

In the second part of this system, the target
oligonucleotide is labelled to allow detection. Labelling
of the target oligonucleotide can be accomplished by
standard techniques such as radiolabelling. Alternatively,
a moiety such as digoxigenin can be incorporated in the
target oligonucleotide and this moiety can then be detected
after capture.

Three embodiments of the capture/detection system
described by the present disclosure are as follows:

(i) the target oligonucleotide (containing, for
example, the screening and test sequences): modification
of the cognate binding site with biotin and incorporation
of digoxigenin or radioactivity; capture of
the target oligonucleotide using streptavidin attached to
a solid support; and detection of the target
oligonucleotide using a tagged anti-digoxigenin antibody or
radioactivity measurement (e.g., autoradiography or counting
in scintillation fluor, or using a phosphoimager).

(ii) the target oligonucleotide: modification of the
cognate binding site with digoxigenin and incorporation of
biotin or radioactivity; capture of the target
oligonucleotide using an anti-digoxigenin antibody
attached to a solid support; and detection of the target
oligonucleotide using tagged streptavidin or radioactivity
measurements.

(iii) separation of the target oligonucleotide which
is bound to protein from the target oligonucleotide which
is not bound to protein by passing the assay mixture
through a nitrocellulose filter under conditions in which
the protein:DNA complexes are retained by the
nitrocellulose while the non-protein bound DNA passes
through the nitrocellulose; and detection of the target
oligonucleotide using radioactivity, tagged
anti-digoxigenin:digoxigenin interactions, or tagged
streptavidin:biotin interactions.

Brief Description of the Figures

Figure 1A illustrates a DNA-binding protein binding to
a screening sequence. Figures 1B and 1C illustrate how a
DNA-binding protein may be displaced or hindered in binding
by a small molecule by two different mechanisms: because
of steric hindrance (1B) or because of conformational
(allosteric) changes induced in the DNA by a small molecule
(1C).

Figure 2 illustrates an assay for detecting inhibitory
molecules based on their ability to preferentially hinder
the binding of a DNA-binding protein to its binding site.
Protein (O) is displaced from DNA (/) in the presence of
inhibitor (X). Two alternative capture/detection systems
are illustrated: the capture and detection of unbound DNA
or the capture and detection of DNA:protein complexes.

Figure 3 shows a DNA-binding protein that is able to
protect a biotin moiety, covalently attached to the
oligonucleotide sequence, from being recognized by the
streptavidin when the protein is bound to the DNA.

Figure 4A shows the incorporation of biotin and
digoxigenin into a typical oligonucleotide molecule for use
in the assay of the present invention. The oligonucleotide
contains the binding sequence (i.e., the screening
sequence) of the UL9 protein, which is underlined, and test
sequences flanking the screening sequence. Figure 4B shows
the preparation of double-stranded oligonucleotides
end-labeled with either digoxigenin or 32P.

Figure 5 shows a series of sequences that have been
tested in the assay of the present invention for the
binding of sequence-specific small molecules.

Figure 6 outlines the cloning of a truncated form of
the UL9 protein, which retains its sequence-specific
DNA-binding ability (UL9-COOH), into an expression vector.

Figure 7 shows the pVL1393 baculovirus vector
containing the full length UL9 protein coding sequence.

Figure 8 is a photograph of a SDS-polyacrylamide gel
showing (i) the purified UL9-COOH/glutathione-S-transferase
fusion protein and (ii) the UL9-COOH polypeptide. In the
figure the UL9-COOH polypeptide is indicated by an arrow.

Figure 9 shows the effect on UL9-COOH binding of
alterations in the test sequences that flank the UL9
screening sequence. The data are displayed on band shift
gels.

Figure 10A shows the effect of the addition of several
concentrations of Distamycin A to DNA:protein assay
reactions utilizing different test sequences. Figure 10B
shows the effect of the addition of Actinomycin D to
DNA:protein assay reactions utilizing different test
sequences. Figure 10C shows the effect of the addition of
Doxorubicin to DNA:protein assay reactions utilizing
different test sequences.

Figure 11A illustrates a DNA capture system of the
present invention utilizing biotin and streptavidin coated
magnetic beads. The presence of the DNA is detected using
an alkaline-phosphatase substrate that yields a
chemiluminescent product. Figure 11B shows a similar
reaction using biotin coated agarose beads that are
conjugated to streptavidin, that in turn is conjugated to
the captured DNA.

Figure 12 demonstrates a test matrix based on
DNA:protein-binding data.

Figure 13 lists the top strands (5'-3') of all the
possible four base pair sequences that could be used as a
defined set of ordered test sequences in the assay (for a
screening sequence having n bases, where n=4).

Figure 14 lists the top strands (5'-3') of all the
possible four base pair sequences that have the same base
composition as the sequence 5'-GATC-3'. This is another
example of a defined, ordered set of sequences that could
be tested in the assay.

Figure 15 shows an example of an oligonucleotide
molecule containing test sequences flanking a screening
sequence. The sequence of this molecule is presented as
SEQ ID NO:18, where the "X" of Figure 15 is N in SEQ ID
NO:18.

Detailed Description of the Invention

Definitions: 

Adjacent is used to describe the distance relationship
between two neighboring DNA sites. Adjacent sites are 20
or less bp apart, or more preferably, 10 or less bp apart,
or even more preferably, 5 or less bp apart, or most
preferably, immediately abutting one another. "Flanking"
is a synonym for adjacent.

Bound DNA, as used in this disclosure, refers to the
DNA that is bound by the protein used in the assay (i.e., in
the examples of this disclosure, the UL9 protein).

Dissociation is the process by which two molecules
cease to interact: the process occurs at a fixed average
rate under specific physical conditions.

Functional binding is the noncovalent association of
a protein or small molecule to the DNA molecule. In the
assay of the present invention the functional binding of
the protein to the screening sequence (i.e., its cognate
DNA binding site) has been evaluated using filter binding
or gel band-shift experiments.

Heteromolecules are molecules that are comprised of at
least two different types of molecules: for example, the
covalent coupling of at least two small organic DNA-binding
molecules (e.g., distamycin, actinomycin D, or acridine) to
each other or the covalent coupling of such a DNA-binding
molecule(s) to a DNA-binding polymer (e.g., a
deoxyoligonucleotide).

On-rate is herein defined as the time required for two
molecules to reach steady state association: for example,
the DNA:protein complex.

Off-rate is herein defined as the time required for
one-half of the associated complexes, e.g., DNA:protein
complexes, to dissociate.

Sequence-specific binding refers to DNA binding
molecules which have a strong DNA sequence binding
preference. For example, restriction enzymes and the
proteins listed in Table I demonstrate typical
sequence-specific DNA-binding.

Sequence-preferential binding refers to DNA binding
molecules that generally bind DNA but that show preference
for binding to some DNA sequences over others.
Sequence-preferential binding is typified by several of the
small molecules tested in the present disclosure, e.g.,
distamycin. Sequence-preferential and sequence-specific
binding can be evaluated using a test matrix such as is
presented in Figure 12. For a given DNA-binding molecule,
there is a spectrum of differential affinities for
different DNA sequences ranging from non-sequence-specific
(no detectable preference) to sequence preferential to
absolute sequence specificity (i.e., the recognition of only
a single sequence among all possible sequences, as is the
case with many restriction endonucleases).

Screening sequence is the DNA sequence that defines
the cognate binding site for the DNA binding protein: in
the case of UL9 the screening sequence can, for example, be
SEQ ID NO:1.

Small molecules are desirable as therapeutics for
several reasons related to drug delivery: (i) they are
commonly less than 10 K molecular weight; (ii) they are
more likely to be permeable to cells; (iii) unlike
peptides or oligonucleotides, they are less susceptible to
degradation by many cellular mechanisms; and (iv) they
are not as apt to elicit an immune response. Many
pharmaceutical companies have extensive libraries of
chemical and/or biological mixtures, often fungal,
bacterial, or algal extracts, that would be desirable to
screen with the assay of the present invention. Small
molecules may be either biological or synthetic organic
compounds, or even inorganic compounds (e.g., cisplatin).

Test sequence is a DNA sequence adjacent the screening
sequence. The assay of the present invention screens for
molecules that, when bound to the test sequence, affect the
interaction of the DNA-binding protein with its cognate
binding site (i.e., the screening sequence). Test
sequences can be placed adjacent either or both ends of the
screening sequence. Typically, binding of molecules to the
test sequence interferes with the binding of the
DNA-binding protein to the screening sequence. However, some
molecules binding to these sequences may have the reverse
effect, causing an increased binding affinity of the
DNA-binding protein to the screening sequence. Some
molecules, even while binding in a sequence specific or
sequence preferential manner, might have no effect in the
assay. These molecules would not be detected in the assay.

Unbound DNA, as used in this disclosure, refers to the
DNA that is not bound by the protein used in the assay
(i.e., in the examples of this disclosure, the UL9 protein).


I. The Assay 

One feature of the present invention is that it
provides an assay to identify small molecules that will
bind in a sequence-specific manner to medically significant
DNA target sites. The assay facilitates the development of
a new field of pharmaceuticals that operate by interfering
with specific DNA functions, such as crucial DNA:protein
interactions. A sensitive, well-controlled assay to detect
DNA-binding molecules and to determine their
sequence-specificity and affinity has been developed. The
assay can be used to screen large biological and chemical
libraries; for example, the assay will be used to detect
sequence-specific DNA-binding molecules in fermentation
broths or extracts from various microorganisms.
Furthermore, another application for the assay is to
determine the sequence specificity and relative affinities
of known DNA-binding drugs (and other DNA-binding
molecules) for different DNA sequences. The drugs, which
are primarily used in anticancer treatments, may have
previously unidentified activities that make them strong
candidates for therapeutics or therapeutic precursors in
entirely different areas of medicine.

The screening assay is basically a competition assay
that is designed to test the ability of a molecule to
compete with a DNA-binding protein for binding to a short,
synthetic, double-stranded oligodeoxynucleotide that
contains the recognition sequence for the DNA-binding
protein flanked on either or both sides by a variable test
site. The variable test site may contain any DNA sequence
that provides a reasonable recognition sequence for a
DNA-binding molecule. Molecules that bind to the test site
alter the binding characteristics of the protein in a
manner that can be readily detected; the extent to which
such molecules are able to alter the binding
characteristics of the protein is likely to be directly
proportional to the affinity of the test molecule for the
DNA test site. The relative affinity of a given molecule
for different oligonucleotide sequences at the test site
(i.e., the test sequences) can be established by examining
its effect on the DNA:protein interaction in each of the
oligonucleotides. The determination of the high affinity
DNA binding sites for DNA-binding molecules will make it
possible to identify specific target sequences for drug
development.
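
The comparison of one molecule across several test sequences can be sketched as a simple ranking. The following Python fragment is a minimal illustration, assuming that the fraction of unbound DNA has been measured for each test oligonucleotide with and without the test molecule; the function name and all numbers are hypothetical and do not come from the disclosure.

```python
def rank_test_sequences(unbound_without, unbound_with):
    """Rank test sequences by the increase in unbound DNA caused by a test molecule.

    Both arguments map a test-sequence name to the measured fraction of
    unbound DNA (0.0 to 1.0), without and with the test molecule.
    """
    effects = {
        seq: unbound_with[seq] - unbound_without[seq]
        for seq in unbound_without
    }
    # Largest increase in unbound DNA first (strongest apparent effect).
    return sorted(effects.items(), key=lambda item: item[1], reverse=True)


# Hypothetical measurements for three test sequences.
without_molecule = {"AAAA": 0.05, "GATC": 0.05, "CGCG": 0.05}
with_molecule = {"AAAA": 0.60, "GATC": 0.20, "CGCG": 0.06}

for name, change in rank_test_sequences(without_molecule, with_molecule):
    print(name, round(change, 2))
```

The ordering, rather than the absolute values, is what such a comparison would be used for, since the text states only that the effect is likely to be proportional to affinity.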

A. General Considerations.

The assay of the present invention has been designed
for detecting test molecules or compounds that affect the
rate of transfer of a specific DNA molecule from one
protein molecule to another identical protein in solution.

A mixture of DNA and protein is prepared in solution.
The concentration of protein is in excess to the
concentration of the DNA so that virtually all of the DNA
is found in DNA:protein complexes. The DNA is a
double-stranded oligonucleotide that contains the
recognition sequence for a specific DNA-binding protein
(i.e., the screening sequence). The protein used in the
assay contains a DNA-binding domain that is specific for
binding to the sequence within the oligonucleotide. The
physical conditions of the solution (e.g., pH, salt
concentration, temperature) are adjusted such that the
half-life of the complex is amenable to performing the
assay (optimally a half-life of 5-30 minutes), preferably
in a range that is close to normal physiological
conditions.
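
The half-life quoted above can be related to a dissociation rate constant as in the following minimal Python sketch. It assumes simple first-order dissociation with reassociation ignored, which is a simplification of the assay conditions; the function names are hypothetical and are not part of the disclosure.

```python
import math


def off_rate_constant(half_life_min):
    """First-order dissociation rate constant (per minute) from a half-life."""
    return math.log(2) / half_life_min


def fraction_complex_remaining(t_min, half_life_min):
    """Fraction of DNA:protein complexes still intact after t_min minutes,
    assuming first-order decay and no reassociation (an assumption)."""
    return math.exp(-off_rate_constant(half_life_min) * t_min)


# Compare the two ends of the 5-30 minute range over a 10 minute window.
for half_life in (5, 30):
    print(half_life, round(fraction_complex_remaining(10, half_life), 3))
```

Under this assumption a shorter half-life leaves fewer intact complexes within a fixed observation window, which is why the half-life has to be matched to the time needed for measurement.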

As one DNA:protein complex dissociates, the released
DNA rapidly reforms a complex with another protein in
solution. Since the protein is in excess to the DNA,
dissociations of one complex always result in the rapid
reassociation of the DNA into another DNA:protein complex.
At equilibrium, very few DNA molecules will be unbound.
The minimum background of the assay is the amount of
unbound DNA observed during any given measurable time
period. The brevity of the observation period and the
sensitivity of the detection system define the lower limits
of background DNA.
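
The statement that very few DNA molecules are unbound at equilibrium can be illustrated with the standard expression for a 1:1 binding equilibrium. The Python sketch below assumes that the protein is in large excess, so that the free protein concentration is approximately the total protein concentration; the dissociation constant and concentrations are hypothetical values, not data from the disclosure.

```python
def fraction_unbound(kd_nM, protein_nM):
    """Equilibrium fraction of DNA not bound by protein for 1:1 binding.

    Assumes protein is in large excess over DNA, so free protein is
    approximated by total protein: f_unbound = Kd / (Kd + [P]).
    """
    if kd_nM < 0 or protein_nM < 0 or kd_nM + protein_nM == 0:
        raise ValueError("concentrations must be non-negative and not both zero")
    return kd_nM / (kd_nM + protein_nM)


# Hypothetical example: Kd of 1 nM at increasing protein concentrations.
for protein in (1.0, 9.0, 99.0):
    print(protein, fraction_unbound(1.0, protein))
```

Raising the protein concentration lowers the unbound background, but only the detection system and observation time determine how small a background can actually be measured.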

Figure 1 illustrates how such a protein can be
displaced from its cognate binding site or how a protein
can be prevented from binding its cognate binding site, or
how the kinetics of the DNA:protein interaction can be
altered. One mechanism is steric hindrance of protein
binding by a small molecule. Alternatively, a molecule may
interfere with a DNA:protein binding interaction by
inducing a conformational change in the DNA. In either
event, if a test molecule that binds the oligonucleotide
hinders binding of the protein, the rate of transfer of DNA
from one protein to another will be decreased. This will
result in a net increase in the amount of unbound DNA. In
other words, an increase in the amount of unbound DNA or a
decrease in the amount of bound DNA indicates the presence
of an inhibitor.

Alternatively, molecules may be isolated that, when
bound to the DNA, cause an increased affinity of the
DNA-binding protein for its cognate binding site. In this
case the amount of unbound DNA (observed during a given
measurable time period after the addition of the molecule)
will decrease in the reaction mixture as detected by the
capture/detection system described in Section II.
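
The two outcomes described above (more unbound DNA for an inhibitor, less unbound DNA for a molecule that increases affinity) can be summarized in a small decision function. This is a minimal Python sketch; the function name and the threshold value are hypothetical, and in practice the threshold would depend on the background and sensitivity of the detection system.

```python
def classify_test_molecule(unbound_before, unbound_after, threshold=0.05):
    """Classify a test molecule from the change in the unbound DNA fraction.

    unbound_before / unbound_after are fractions (0.0 to 1.0) measured before
    and after adding the test molecule. The threshold is an assumed minimum
    change considered detectable above background.
    """
    change = unbound_after - unbound_before
    if change > threshold:
        return "inhibitor"            # more unbound DNA: protein binding hindered
    if change < -threshold:
        return "enhancer"             # less unbound DNA: protein binding increased
    return "no detectable effect"


print(classify_test_molecule(0.05, 0.40))
print(classify_test_molecule(0.20, 0.05))
print(classify_test_molecule(0.05, 0.06))
```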

B. Other Methods

There are several approaches that could be taken to
look for small molecules that specifically inhibit the
interaction of a given DNA-binding protein with its binding
sequence (cognate site). One approach would be to test
biological or chemical compounds for their ability to
preferentially block the binding of one specific
DNA:protein interaction but not the others. Such an assay
would depend on the development of at least two, preferably
three, DNA:protein interaction systems in order to
establish controls for distinguishing between general
DNA-binding molecules (polycations like heparin or
intercalating agents like ethidium) and DNA-binding
molecules having sequence binding preferences that would
affect protein/cognate binding site interactions in one
system but not the other(s).

One illustration of how this system could be used is
as follows. Each cognate site could be placed 5' to a
reporter gene (such as genes encoding beta-galactosidase or
luciferase) such that binding of the protein to the cognate
site would enhance transcription of the reporter gene. The
presence of a sequence-specific DNA-binding drug that
blocked the DNA:protein interaction would decrease the
enhancement of the reporter gene expression. Several DNA
enhancers could be coupled to reporter genes, then each
construct compared to one another in the presence or
absence of small DNA-binding test molecules. In the case
where multiple protein/cognate binding sites are used for
screening, a competitive inhibitor that blocks one
interaction but not the others could be identified by the
lack of transcription of a reporter gene in a transfected
cell line or in an in vitro assay. Only one such
DNA-binding sequence, specific for the protein of interest,
could be screened with each assay system. This approach
has a number of limitations including limited testing
capability and the need to construct the appropriate
reporter system for each different protein/cognate site of
interest.

C. Choosing and Testing an Appropriate DNA-Binding
Protein.

Experiments performed in support of the present
invention have defined a second approach for identifying
molecules having sequence-preferential DNA-binding. In
this approach small molecules binding to sequences adjacent
the cognate binding sequence can inhibit the
protein/cognate DNA interaction. This assay has been
designed to use a single DNA:protein interaction to screen
for sequence-specific or sequence-preferential DNA-binding
molecules that recognize virtually any sequence.

While DNA-binding recognition sites are usually quite
small (4-7 bp), the sequence protected by the
binding protein is larger (usually 5 bp or more on either
side of the recognition sequence), as detected by DNAase
I protection (Galas et al.) or methylation interference
(Siebenlist et al.). Experiments performed in support of
the present invention demonstrated that a single protein
and its cognate DNA-binding sequence can be used to assay
virtually any DNA sequence by placing a sequence of
interest adjacent to the cognate site: a small molecule
bound to the adjacent site can be detected by alterations
in the binding characteristics of the protein to its
cognate site. Such alterations might occur by either
steric hindrance, which would cause the dissociation of the
protein, or induced conformational changes in the
recognition sequence for the protein, which may cause
either enhanced binding or, more likely, decreased binding
of the protein to its cognate site.

1) Criteria for choosing an appropriate DNA-binding
protein.

There are several considerations involved in choosing
DNA:protein complexes that can be employed in the assay of
the present invention including:

a) The off-rate (see "Definitions") should be
fast enough to accomplish the assay in a reasonable amount
of time. The interactions of some proteins with cognate
sites in DNA can be measured in days not minutes: such
tightly bound complexes would inconveniently lengthen the
period of time it takes to perform the assay.

b) The off-rate should be slow enough to allow
the measurement of unbound DNA in a reasonable amount of
time. For example, the level of free DNA is dictated by
the ratio between the time needed to measure free DNA and
the amount of free DNA that occurs naturally due to the
off-rate during the measurement time period.

In view of the above two considerations, practical
useful DNA:protein off-rates fall in the range of
approximately two minutes to several days, although shorter
off-rates may be accommodated by faster equipment and longer
off-rates may be accommodated by destabilizing the binding
conditions for the assay.

c) A further consideration is that the kinetic
interactions of the DNA:protein complex should be relatively
insensitive to the nucleotide sequences flanking the
recognition sequence. The affinity of many DNA-binding
proteins is affected by differences in the sequences
adjacent to the recognition sequence. The most obvious
example of this phenomenon is the preferential binding and
cleavage of restriction enzymes given a choice of several
identical recognition sequences with different flanking
sequences (Polinsky et al.). If the off-rates are affected
by flanking sequences the analysis of comparative binding
data between different flanking oligonucleotide sequences
becomes difficult but is not impossible.

2) Testing DNA:protein interactions for use in the
assay.

Experiments performed in support of the present
invention have identified a DNA:protein interaction that is
particularly useful for the above described assay: the
Herpes Simplex Virus (HSV) UL9 protein that binds the HSV
origin of replication (oriS). The UL9 protein has fairly
stringent sequence specificity. There appear to be three
binding sites for UL9 in oriS, SEQ ID NO:1, SEQ ID NO:2,
and SEQ ID NO:17 (Elias, P. et al., Stow et al.). One
sequence (SEQ ID NO:1) binds with at least 10-fold higher
affinity than the second sequence (SEQ ID NO:2): the
embodiments described below use the higher affinity binding
site (SEQ ID NO:1).

DNA:protein association reactions are performed in
solution. The DNA:protein complexes can be separated from
free DNA by any of several methods. One particularly
useful method for the initial study of DNA:protein
interactions has been visualization of binding results
using band shift gels (Example 3A). In this method
DNA:protein binding reactions are applied to
polyacrylamide/TBE gels and the labelled complexes and free
labelled DNA are separated electrophoretically. These gels
are fixed, dried, and exposed to X-ray film. The resulting
autoradiograms are examined for the amount of free probe
that is migrating separately from the DNA:protein complex.

These assays include (i) a lane containing only free
labeled probe, and (ii) a lane where the sample is labeled
probe in the presence of a large excess of binding protein.
The band shift assays allow visualization of the ratios
between DNA:protein complexes and free probe. However,
they are less accurate than filter binding assays for
rate-determining experiments due to the lag time between loading
the gel and electrophoretic separation of the components.

The filter binding method is particularly useful in
determining the off-rates for protein:oligonucleotide
complexes (Example 3B). In the filter binding assay,
DNA:protein complexes are retained on a filter while free
DNA passes through the filter. This assay method is more
accurate for off-rate determinations because the separation
of DNA:protein complexes from free probe is very rapid.
The disadvantage of filter binding is that the nature of
the DNA:protein complex cannot be directly visualized. So
if, for example, the competing molecule was also a protein
competing for the binding of a site on the DNA molecule,
filter binding assays cannot differentiate between the
binding of the two proteins nor yield information about
whether one or both proteins are binding.

There are many known DNA:protein interactions that may
be useful in the practice of the present invention
including (i) the DNA:protein interactions listed in Table
I, (ii) bacterial, yeast, and phage systems such as lambda
oL-oR/cro, and (iii) modified restriction enzyme systems
(e.g., protein binding in the absence of divalent cations).
Any protein that binds to a specific recognition sequence
may be useful in the present invention. One constraining
factor is the effect of the immediately adjacent sequences
(the test sequences) on the affinity of the protein for its
recognition sequence. DNA:protein interactions in which
there is little or no effect of the test sequences on the
affinity of the protein for its cognate site are preferable
for use in the described assay; however, DNA:protein
interactions that exhibit (test sequence-dependent)
differential binding may still be useful if algorithms are
applied to the analysis of data that compensate for the
differential affinity. In general, the effect of flanking
sequence composition on the binding of the protein is
likely to be correlated to the length of the recognition
sequence for the DNA-binding protein. In short, the
kinetics of binding for proteins with shorter recognition
sequences are more likely to suffer from flanking sequence
effects, while the kinetics of binding for proteins with
longer recognition sequences are more likely to not be
affected by flanking sequence composition. The present
disclosure provides methods and guidance for testing the
usefulness of such DNA:protein interactions, i.e., other
than the UL9 oriS binding site interaction, in the
screening assay.

D. Preparation of Full Length UL9 and UL9-COOH
Polypeptides.

UL9 protein has been prepared by a number of
recombinant techniques (Example 2). The full length UL9
protein has been prepared from baculovirus infected insect
cultures (Example 3A, B, and C). Further, a portion of the
UL9 protein that contains the DNA-binding domain (UL9-COOH)
has been cloned into a bacterial expression vector and
produced by bacterial cells (Example 3D and E). The
DNA-binding domain of UL9 is contained within the C-terminal
317 amino acids of the protein (Weir et al.). The
UL9-COOH polypeptide was inserted into the expression vector
in-frame with the glutathione-S-transferase (gst) protein.
The gst/UL9 fusion protein was purified using affinity
chromatography (Example 3E). The vector also contained a
thrombin cleavage site at the junction of the two
polypeptides. Therefore, once the fusion protein was
isolated (Figure 8, lane 2) it was treated with thrombin,
cleaving the UL9-COOH polypeptide from the gst
polypeptide (Figure 8, lane 3). The UL9-COOH-gst fusion
polypeptide was obtained at a protein purity of greater
than 95% as determined using Coomassie staining.

Other hybrid proteins can be utilized to prepare
DNA-binding proteins of interest. For example, a
DNA-binding protein coding sequence can be fused in-frame with a
sequence encoding the thrombin site and also in-frame with the
β-galactosidase coding sequence. Such hybrid proteins can be
isolated by affinity or immunoaffinity columns (Maniatis et
al.; Pierce, Rockford IL). Further, DNA-binding proteins
can be isolated by affinity chromatography based on their
ability to interact with their cognate DNA binding site.
For example, the UL9 DNA-binding site (SEQ ID NO:1) can be
covalently linked to a solid support (e.g., CNBr-activated
Sepharose 4B beads, Pharmacia, Piscataway NJ), extracts
passed over the support, the support washed, and the
DNA-binding protein then isolated from the support with a salt
gradient (Kadonaga). Alternatively, other expression systems in
bacteria, yeast, insect cells or mammalian cells can be
used to express adequate levels of a DNA-binding protein
for use in this assay.

The results presented below in regard to the
DNA-binding ability of the truncated UL9 protein suggest that
full-length DNA-binding proteins are not required for the
DNA:protein assay of the present invention: only a portion
of the protein containing the cognate site recognition
function may be required. The portion of a DNA-binding
protein required for DNA-binding can be evaluated using a
functional binding assay (Example 4A). The rate of
dissociation can be evaluated (Example 4B) and compared to
that of the full length DNA-binding protein. However, any
DNA-binding peptide, truncated or full length, may be used
in the assay if it meets the criteria outlined in part
I.C.1, "Criteria for choosing an appropriate DNA-binding
protein". This remains true whether or not the truncated
form of the DNA-binding protein has the same affinity as
the full length DNA-binding protein.

E. Functional Binding and Rate of Dissociation.

The full length UL9 and purified UL9-COOH proteins
were tested for functional activity in "band shift" assays
(see Example 4A). The buffer conditions were optimized for
DNA:protein-binding (Example 4C) using the UL9-COOH
polypeptide. These DNA-binding conditions also worked well
for the full-length UL9 protein. Radiolabelled
oligonucleotides (SEQ ID NO:14) that contained the 11 bp
UL9 DNA-binding recognition sequence (SEQ ID NO:1) were
mixed with each UL9 protein in appropriate binding buffer.
The reactions were incubated at room temperature for 10
minutes (binding occurs in less than 2 minutes) and the
products were separated electrophoretically on
non-denaturing polyacrylamide gels (Example 4A). The degree of
DNA:protein-binding could be determined from the ratio of
labeled probe present in DNA:protein complexes versus that
present as free probe. This ratio was typically
determined by optical scanning of autoradiograms and
comparison of band intensities. Other standard methods may
be used as well for this determination, such as
scintillation counting of excised bands. The UL9-COOH
polypeptide and the full length UL9 polypeptide, in their
respective buffer conditions, bound the target
oligonucleotide equally well.
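
The ratio described above can be expressed as a fraction of probe
in complex. The following is a minimal sketch, assuming that the
band intensities (from densitometry or scintillation counts) are
available as plain numbers and that a single background value per
lane is subtracted. The function name and the background handling
are illustrative assumptions, not taken from the Examples.

```python
def fraction_bound(complex_signal, free_signal, background=0.0):
    """Fraction of labeled probe found in the DNA:protein complex band.

    complex_signal and free_signal are band intensities from one lane;
    background is subtracted from each band and clipped at zero.
    """
    complex_net = max(complex_signal - background, 0.0)
    free_net = max(free_signal - background, 0.0)
    total = complex_net + free_net
    if total == 0:
        raise ValueError("no signal above background in this lane")
    return complex_net / total


if __name__ == "__main__":
    # Hypothetical intensities for one lane (arbitrary units).
    print(fraction_bound(complex_signal=300.0, free_signal=100.0))
```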

The rate of dissociation was determined using
competition assays. An excess of unlabelled
oligonucleotide that contained the UL9 binding site was
added to each reaction. This unlabelled oligonucleotide
acts as a specific inhibitor, capturing the UL9 protein as
it dissociates from the labelled oligonucleotide (Example
4B). The dissociation rate, as determined by a band-shift
assay, for both full length UL9 and UL9-COOH was
approximately 4 hours at 4°C or approximately 10 minutes at
room temperature. Neither non-specific oligonucleotides (a
10,000-fold excess) nor sheared herring sperm DNA (a
100,000-fold excess) competed for binding with the
oligonucleotide containing the UL9 binding site.
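
A dissociation time of this kind can be estimated from a
competition time course. The following is a minimal sketch that
fits a straight line to the logarithm of the fraction of labelled
probe remaining in complex. It assumes first-order dissociation
and that the reported times correspond to half-lives. The data
points are hypothetical and the function name is an assumption.

```python
import math


def fit_half_life(times_min, fraction_in_complex):
    """Least-squares fit of ln(fraction in complex) against time.

    Returns the half-life in minutes, assuming first-order dissociation
    after addition of excess unlabelled competitor.
    """
    logs = [math.log(f) for f in fraction_in_complex]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    k_off = -sxy / sxx  # slope of the fitted line is -k_off
    return math.log(2) / k_off


if __name__ == "__main__":
    # Hypothetical time course (minutes, fraction of probe still in complex).
    times = [0, 5, 10, 20, 40]
    fractions = [0.90, 0.64, 0.45, 0.22, 0.06]
    print(f"estimated half-life: {fit_half_life(times, fractions):.1f} min")
```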

F. oriS Flanking Sequence Variation.

As mentioned above, one feature of a DNA:protein-binding
system for use in the assay of the present
invention is that the DNA:protein interaction is not
affected by the nucleotide sequence of the regions adjacent
the DNA-binding site. The sensitivity of any
DNA:protein-binding reaction to the composition of the flanking
sequences can be evaluated by the functional binding assay
and dissociation assay described above.

To test the effect of flanking sequence variation on
UL9 binding to the oriS SEQ ID NO:1 sequence,
oligonucleotides were constructed with 20-30 different
sequences (i.e., the test sequences) flanking the 5' and 3'
sides of the UL9 binding site. Further, oligonucleotides
were constructed with point mutations at several positions
within the UL9 binding site. Most point mutations within
the binding site destroyed recognition. Several changes
did not destroy recognition and these include variations at
sites that differ between the three UL9 binding sites (SEQ
ID NO:1, SEQ ID NO:2 and SEQ ID NO:17): the second UL9
binding site (SEQ ID NO:2) shows a ten-fold decrease in
UL9:DNA binding affinity (Elias et al.) relative to the
first (SEQ ID NO:1). On the other hand, sequence variation
at the test site (also called the test sequence), adjacent
to the screening site (Figure 5, Example 5), had virtually
no effect on binding or the rate of dissociation.

The finding that the nucleotide sequence
in the test site, which flanks the screening site, has no
effect on the kinetics of UL9 binding in any of the
oligonucleotides tested is a striking result. This allows
the direct comparison of the effect of a DNA-binding
molecule on test oligonucleotides that contain different
test sequences. Since the only difference between test
oligonucleotides is the difference in nucleotide sequence
at the test site(s), and since the nucleotide sequence at
the test site has no effect on UL9 binding, any
differential effect observed between two test
oligonucleotides in response to a DNA-binding molecule must
be due solely to the differential interaction of the
DNA-binding molecule with the test sequence(s). In this
manner, the insensitivity of UL9 to the test sequences
flanking the UL9 binding site greatly facilitates the
interpretation of results. Each test oligonucleotide acts
as a control sample for all other test oligonucleotides.
This is particularly true when ordered sets of test
sequences are tested (e.g., testing all 256 four base pair
sequences (Figure 13) for binding to a single drug).
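
An ordered set of test sequences, such as the 256 four base pair
sequences, can be enumerated programmatically. The following is a
minimal sketch. The function name is hypothetical, and the way in
which each test sequence is joined to the screening sequence is
not shown.

```python
from itertools import product


def all_test_sequences(length=4, alphabet="ACGT"):
    """Return every sequence of the given length, in alphabetical order."""
    return ["".join(bases) for bases in product(alphabet, repeat=length)]


if __name__ == "__main__":
    sequences = all_test_sequences(4)
    print(len(sequences), sequences[:3], sequences[-1])
```

Because the list is ordered, each position maps to exactly one
test sequence, which simplifies comparing the effect of a single
drug across the whole set.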

Taken together, the above experiments support that the
UL9-COOH polypeptide binds the SEQ ID NO:1 sequence with
(i) appropriate strength, (ii) an acceptable dissociation
time, and (iii) indifference to the nucleotide sequences
flanking the assay (binding) site. These features
suggested that the UL9/oriS system could provide a
versatile assay for detection of small molecule/DNA-binding
involving any number of specific nucleotide sequences.

The above-described experiment can be used to screen
other DNA:protein interactions to determine their
usefulness in the present assay.

G. Small Molecules as Sequence-Specific Competitive
Inhibitors.

To test the utility of the present assay system,
several small molecules that have sequence preferences
(e.g., a preference for AT-rich versus GC-rich sequences)
have been tested.

Distamycin A binds relatively weakly to DNA
($K_A = 2 \times 10^{5}\ \mathrm{M}^{-1}$) with a preference for
non-alternating AT-rich sequences (Jain et al.; Sobell;
Sobell et al.). Actinomycin D binds DNA more strongly
($K_A = 7.6 \times 10^{-7}\ \mathrm{M}^{-1}$)
than Distamycin A and has a relatively strong preference 
for the dinucleotide sequence dGdC (Luck et al.; Zimmer;
Wartel). Each of these molecules poses a stringent test
for the assay. Distamycin A tests the sensitivity of the
assay because of its relatively weak binding. Actinomycin
D challenges the ability to utilize flanking sequences
since the UL9 recognition sequence contains a dGdC
dinucleotide: therefore, it might be anticipated that all
of the oligonucleotides, regardless of the test sequence
flanking the assay site, might be equally affected by
actinomycin D.

In addition, Doxorubicin, a known anti-cancer agent
that binds DNA in a sequence-preferential manner (Chen,
K-X, et al.), has been tested for preferential DNA sequence
binding using the assay of the present invention.

Actinomycin D, Distamycin A, and Doxorubicin have been
tested for their ability to preferentially inhibit the
binding of UL9 to oligonucleotides containing different
sequences flanking the UL9 binding site (Example 6, Figure
5). Binding assays were performed as described in the
Examples. These studies were completed under conditions in which
UL9 is in excess of the DNA (i.e., most of the DNA is in
complex).

Distamycin A was tested with 5 different test
sequences flanking the UL9 screening sequence: SEQ ID NO:5
to SEQ ID NO:9. The results shown in Figure 10A
demonstrate that distamycin A preferentially disrupts
binding to the test sequences UL9 polyT, UL9 polyA and, to
a lesser extent, UL9 ATAT. Figure 10A also shows the
concentration dependence of the inhibitory effect of
distamycin A: at 1 µM distamycin A most of the DNA:protein
complexes are intact (top band) with free probe appearing
in the UL9 polyT and UL9 polyA lanes, and some free probe
appearing in the UL9 ATAT lane; at 4 µM free probe can be
seen in the UL9 polyT and UL9 polyA lanes; at 16 µM free
probe can be seen in the UL9 polyT and UL9 polyA lanes; and
at 40 µM the DNA:protein complexes in the UL9 polyT, UL9
polyA and UL9 ATAT lanes are nearly completely disrupted while
some DNA:protein complexes in the other lanes persist. These
results are consistent with Distamycin A's known binding
preference for non-alternating AT-rich sequences.

Actinomycin D was tested with 8 different test
sequences flanking the UL9 screening sequence: SEQ ID NO:5
to SEQ ID NO:9, and SEQ ID NO:11 to SEQ ID NO:13. The
results shown in Figure 10B demonstrate that actinomycin D
preferentially disrupts the binding of UL9-COOH to the
oligonucleotides UL9 CCCG (SEQ ID NO:5) and UL9 GGGC (SEQ
ID NO:6). These oligonucleotides contain, respectively,
three or five dGdC dinucleotides in addition to the dGdC
dinucleotide within the UL9 recognition sequence. This
result is consistent with Actinomycin D's known binding
preference for the dinucleotide sequence dGdC. Apparently
the presence of a potential target site within the
screening sequence (oriS, SEQ ID NO:1), as mentioned above,
does not interfere with the function of the assay.

Doxorubicin was tested with 8 different test sequences
flanking the UL9 screening sequence: SEQ ID NO:5 to SEQ ID
NO:9, and SEQ ID NO:11 to SEQ ID NO:13. The results shown
in Figure 10C demonstrate that Doxorubicin preferentially
disrupts binding to oriEco3, the test sequence of which
differs from oriEco2 by only one base (compare SEQ ID NO:12
and SEQ ID NO:13). Figure 10C also shows the concentration
dependence of the inhibitory effect of Doxorubicin: at 15
µM Doxorubicin, the UL9 binding to the screening sequence
is strongly affected when oriEco3 is the test sequence, and
more mildly affected when polyT, UL9 GGGC, or oriEco2 is
the test sequence; and at 35 µM Doxorubicin most
DNA:protein complexes are nearly completely disrupted, with
UL9 polyT and UL9 ATAT showing some DNA still complexed with
protein. Effects similar to those observed at 15 µM
were also observed using Doxorubicin at 150 nM, but at a
later time point.

Further incubation with any of the drugs resulted in
additional disruption of binding. Given that the one hour
incubation time of the above assays is equivalent to
several half-lives of the DNA:protein complex, the
additional disruption of binding suggests that the on-rate
for the drugs is comparatively slow.

The ability of the assay to distinguish sequence
binding preference using weak DNA-binding molecules with
poor sequence-specificity (such as distamycin A) is a
stringent test. Accordingly, the present assay seems
well-suited for the identification of molecules having better
sequence specificity and/or higher sequence binding
affinity. Further, the results demonstrate
sequence-preferential binding with the known anti-cancer drug
Doxorubicin. This result indicates the assay may be useful
for screening mixtures for molecules displaying similar
characteristics that could be subsequently tested for
anti-cancer activities as well as sequence-specific binding.

Other compounds that may be suitable for testing the
present DNA:protein system or for defining alternate
DNA:protein systems include the following: echinomycin,
which preferentially binds to the sequence (A/T)CGT
(Quigley et al.); small inorganic molecules, such as
cobalt hexamine, that are known to induce Z-DNA formation
in regions that contain repetitive GC sequences (Gessner et
al.); and other DNA-binding proteins, such as EcoRI, a
restriction endonuclease.

H. Theoretical considerations on the concentration of
assay components.

There are two components in the assay, the test
sequence (oligonucleotide) and the DNA-binding domain of
UL9, which is described below. A number of theoretical
considerations have been employed in establishing the assay
system of the present invention. In one embodiment of the
invention, the assay is used as a mass-screening assay. In
this capacity, small volumes and concentrations are
desirable. A typical assay uses about 0.1 ng DNA in a
15-20 µl reaction volume (approximately 0.3 nM). The protein
concentration is in excess and can be varied to increase or
decrease the sensitivity of the assay. In the simplest
scenario, where the small molecule is acting as a
competitive inhibitor via steric hindrance, the system
kinetics can be described by the following equations:

$$\mathrm{D} + \mathrm{P} \rightleftharpoons \mathrm{D{:}P}, \quad \text{where } \frac{k_{fp}}{k_{bp}} = K_{eq,p} = \frac{[\mathrm{D{:}P}]}{[\mathrm{D}][\mathrm{P}]}$$

and

$$\mathrm{D} + \mathrm{X} \rightleftharpoons \mathrm{D{:}X}, \quad \text{where } \frac{k_{fx}}{k_{bx}} = K_{eq,x} = \frac{[\mathrm{D{:}X}]}{[\mathrm{D}][\mathrm{X}]}$$

D = DNA, P = protein, X = DNA-binding molecule,
$k_{fp}$ and $k_{fx}$ are the rates of the forward reaction
for the DNA:protein interaction and DNA:drug
interaction, respectively, and $k_{bp}$ and $k_{bx}$ are the
rates of the backwards reactions for the
respective interactions. Brackets, [], indicate
molar concentration of the components.

In the assay, both the protein, P, and the DNA-binding
molecule or drug, X, are competing for the DNA. If steric
hindrance is the mechanism of inhibition, the assumption
can be made that the two molecules are competing for the
same site. When the concentration of DNA equals the
concentration of the DNA:drug or DNA:protein complex, the
equilibrium binding constant, $K_{eq}$, is equal to the
reciprocal of the protein concentration (1/[P]). For UL9,
the calculated $K_{eq,p} = 2.2 \times 10^{9}\ \mathrm{M}^{-1}$. When all three

components are mixed together, the relationship between the
drug and the protein can be described as:

$$K_{eq,p} = z\,K_{eq,x}$$

where "z" defines the difference in affinity for the DNA
between P and X. For example, if z = 4, then the affinity
of the drug is 4-fold lower than the affinity of the
protein for the DNA molecule. The concentration of X,
therefore, must be 4-fold greater than the concentration of
P, to compete equally for the DNA molecule. Thus, the
equilibrium affinity constant of UL9 will define the
minimum level of detection with respect to the
concentration and/or affinity of the drug. Low affinity
DNA-binding molecules will be detected only at high
concentrations; likewise, high affinity molecules can be
detected at relatively low concentrations.
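
The competition described above can be illustrated with a small
calculation. The following is a minimal sketch, assuming simple
single-site competition at equilibrium and assuming that the DNA
is the limiting component, so that free protein and drug
concentrations are approximately equal to their totals. The
function names and the numerical values are illustrative
assumptions and are not measured values for UL9 or any drug.

```python
def fraction_bound_by_protein(k_protein, protein_conc, k_drug, drug_conc):
    """Fraction of DNA in D:P complexes under single-site competition.

    k_protein and k_drug are association constants (per molar);
    concentrations are molar and treated as free concentrations.
    """
    p_term = k_protein * protein_conc
    x_term = k_drug * drug_conc
    return p_term / (1.0 + p_term + x_term)


def fraction_bound_by_drug(k_protein, protein_conc, k_drug, drug_conc):
    """Fraction of DNA in D:X complexes under the same assumptions."""
    p_term = k_protein * protein_conc
    x_term = k_drug * drug_conc
    return x_term / (1.0 + p_term + x_term)


if __name__ == "__main__":
    k_p = 2.0e9          # illustrative protein association constant, per molar
    z = 4.0              # drug affinity is 4-fold lower than the protein's
    k_x = k_p / z
    protein = 1.0e-9     # illustrative protein concentration, molar
    drug = z * protein   # 4-fold more drug than protein
    print(fraction_bound_by_protein(k_p, protein, k_x, drug))
    print(fraction_bound_by_drug(k_p, protein, k_x, drug))
```

With the drug present at z times the protein concentration, the
two fractions come out equal, which corresponds to the statement
that X must be 4-fold more concentrated than P to compete equally
when z = 4.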

With certain test sequences, complete inhibition of
UL9 binding at markedly lower concentrations than indicated
by these analyses has been observed, probably indicating
that certain sites among those chosen for feasibility
studies have affinities higher than previously published.
Note that relatively high concentrations of known drugs can
be utilized for testing sequence specificity. In addition,
the binding constant of UL9 can be readily lowered by
altering the pH or salt concentration in the assay if it is
desirable to screen for molecules that are found at low
concentration (e.g., in a fermentation broth or extract).
Analyses such as presented above become more complex
if the inhibition is allosteric (non-competitive
inhibition) rather than competition by steric hindrance.
Nonetheless, the probability that the relative effect of an
inhibitor on different test sequences is due to its
relative and differential affinity to the different test
sequences is fairly high. This is particularly true in the
assays in which all sequences within an ordered set (e.g.,
all possible sequences of a given length or all possible
variations of a certain base composition and defined
length) are tested. In brief, if the effect of inhibition
in the assay is particularly strong for a single sequence,
then it is likely that the inhibitor binds that particular
sequence with higher affinity than any of the other
sequences. Furthermore, while it may be difficult to
determine the absolute affinity of the inhibitor, the
relative affinities have a high probability of being
reasonably accurate. This information will be most useful
in facilitating, for instance, the refinement of molecular
modeling systems.

I. The use of the assay under conditions of high
protein concentration.

When the screening protein is added to the assay
system at very high concentrations, the protein binds to
non-specific sites on the oligonucleotide in addition to
the screening sequence. This effect has been demonstrated
using band shift gels: in particular, when serial dilutions
are made of the UL9 protein and the dilutions are mixed
with a fixed concentration of oligonucleotide, no binding
(as seen by a band shift) is observed at very low protein
concentrations (e.g., a 1:100,000 dilution), a single band
shift is observed at moderate dilutions (e.g., 1:100) and a
smear, migrating higher than the single band observed at
moderate dilutions, is observed at high concentrations of
protein (e.g., 1:10). In the band shift assay, a smear is
indicative of a mixed population of complexes, all of which
presumably have the screening protein binding to the
screening sequence with high affinity (e.g., for UL9,
$K_{eq} = 1.1 \times 10^{9}\ \mathrm{M}^{-1}$) but in addition have a larger number of
proteins bound with markedly lower affinity.

Some of the low affinity binding proteins are bound to
the test sequence. In experiments performed in support of
the present invention, using mixtures of UL9 and
glutathione-S-transferase, the low affinity binding
proteins are likely UL9 or, less likely,
glutathione-S-transferase, since these are the only proteins in the assay
mixture. These low affinity binding proteins are
significantly more sensitive to interference by a molecule
binding to the test sequence for two reasons. First, the
interference is likely to be by direct steric hindrance
and does not rely on induced conformational changes in the
DNA; secondly, the protein binding to the test site is a
low affinity binding protein because the test site is not
a cognate-binding sequence. In the case of UL9, the
difference in affinity between the low affinity binding and
the high affinity binding appears to be at least two orders
of magnitude.

Experiments performed in support of the present
invention demonstrate that the filter binding assays
capture more DNA:protein complexes when more protein is
bound to the DNA. The relative results are accurate, but
under moderate protein concentrations, not all bound
DNA (as demonstrated by band shift assays) will bind to the
filter unless there is more than one DNA:protein complex
per oligonucleotide (e.g., in the case of UL9, more than
one UL9:DNA complex). This makes the assay exquisitely
sensitive under conditions of high protein concentration.
For instance, when actinomycin D binds DNA at a test site
under conditions where there is one DNA:UL9 complex per
oligonucleotide, a differential-binding effect on GC-rich
oligonucleotides has been observed (see Example 6). Under
conditions of high protein concentration, where more than
one DNA:UL9 complex is found per oligonucleotide, the
differential effect of actinomycin D is even more marked.
These results suggest that the effect of actinomycin D on
a test site that is weakly bound by protein may be more
readily detected than the effect of actinomycin D on the
adjacent screening sequence. Therefore, employing high
protein concentrations may increase the sensitivity of the
assay.

II. Capture/Detection Systems.

As an alternative to the above described band shift
gels and filter binding assays, the effect of
inhibitors can be monitored by measuring either the level
of unbound DNA in the presence of test molecules or
mixtures or the level of DNA:protein complex remaining in
the presence of test molecules or mixtures. Measurements
may be made either at equilibrium or in a kinetic assay,
prior to the time at which equilibrium is reached. The
type of measurement is likely to be dictated by practical
factors, such as the length of time to equilibrium, which
will be determined by both the kinetics of the DNA:protein
interaction as well as the kinetics of the DNA:drug
interaction. The results (i.e., the detection of
DNA-binding molecules and/or the determination of their
sequence preferences) should not vary with the type of
measurement taken (kinetic or equilibrium).

Figure 2 illustrates an assay for detecting inhibitory
molecules based on their ability to preferentially hinder
the binding of a DNA-binding protein. In the presence of
an inhibitory molecule (X) the equilibrium between the
DNA-binding protein and its binding site (screening sequence)
is disrupted. The DNA-binding protein (O) is displaced
from DNA (/) in the presence of inhibitor (X); the DNA free
of protein or, alternatively, the DNA:protein complexes,
can then be captured and detected.

For maximum sensitivity, unbound DNA and DNA:protein
complexes should be sequestered from each other in an
efficient and rapid manner. The method of DNA capture
should allow for the rapid removal of the unbound DNA from
the protein-rich mixture containing the DNA:protein
complexes.

Even if the test molecules are specific in their
interaction with DNA they may have relatively low affinity
and they may also be weak binders of non-specific DNA or
have non-specific interactions with DNA at low
concentrations. In either case, their binding to DNA may
only be transient, much like the transient binding of the
protein in solution. Accordingly, one feature of the assay
is to take a molecular snapshot of the equilibrium state of
a solution composed of the target/assay DNA, the protein,
and the inhibitory test molecule. In the presence of an
inhibitor, the amount of DNA that is not bound to protein
will be greater than in the absence of an inhibitor.
Likewise, in the presence of an inhibitor, the amount of
DNA that is bound to protein will be less than in the
absence of an inhibitor. Any method used to separate the
DNA:protein complexes from unbound DNA should be rapid,
because when the capture system is applied to the solution
(if the capture system is irreversible), the ratio of
unbound DNA to DNA:protein complex will change at a
predetermined rate, based purely on the off-rate of the
DNA:protein complex. This step, therefore, determines the
limits of background. Unlike the protein and inhibitor,
the capture system should bind rapidly and tightly to the
DNA or DNA:protein complex. The longer the capture system
is left in contact with the entire mixture of unbound DNA
and DNA:protein complexes in solution, the higher the
background, regardless of the presence or absence of
inhibitor.

Two exemplary capture systems are described below for
use in the present assay. One capture system has been
devised to capture unbound DNA (part II.A). The other has
been devised to capture DNA:protein complexes (part II.B).
Both systems are amenable to high throughput screening
assays. The same detection methods can be applied to
molecules captured using either capture system (part II.C).

A. Capture of unbound DNA. 

One capture system that has been developed in the
course of experiments performed in support of the present
invention utilizes a streptavidin/biotin interaction for
the rapid capture of unbound DNA from the protein-rich
mixture, which includes unbound DNA, DNA:protein complexes,
excess protein and the test molecules or test mixtures.
Streptavidin binds with extremely high affinity to biotin
($K_d = 10^{-15}$ M) (Chaiet et al.; Green), thus two advantages of
the streptavidin/biotin system are that binding between the
two molecules can be rapid and the interaction is the
strongest known non-covalent interaction.

In this detection system a biotin molecule is
covalently attached in the oligonucleotide screening
sequence (i.e., the DNA-binding protein's binding site).
This attachment is accomplished in such a manner that the
binding of the DNA-binding protein to the DNA is not
destroyed. Further, when the protein is bound to the
biotinylated sequence, the protein prevents the binding of
streptavidin to the biotin. In other words, the DNA-binding
protein is able to protect the biotin from being
recognized by the streptavidin. This DNA:protein
interaction is illustrated in Figure 3.

The capture system is described herein for use with
the UL9/oriS system described above. The following general
testing principles can, however, be applied to analysis of
other DNA:protein interactions. The usefulness of this
system depends on the biophysical characteristics of the
particular DNA:protein interaction.

1) Modification of the protein recognition sequence
with biotin.

The recognition sequence for the binding of the UL9
protein (Koff et al.) is underlined in Figure 4.
Oligonucleotides were synthesized that contain the UL9
binding site and are site-specifically biotinylated at a
number of locations throughout the binding sequence (SEQ ID
NO:14; Example 1, Figure 4). These biotinylated
oligonucleotides were then used in band shift assays to
determine the ability of the UL9 protein to bind to the
oligonucleotide. These experiments, using the biotinylated
probe and a non-biotinylated probe as a control,
demonstrate that the presence of a biotin at the #8-T
(biotinylated deoxyuridine) position of the bottom strand
meets the requirements listed above: the presence of a
biotin moiety at the #8 position of the bottom strand does
not markedly affect the specificity of UL9 for the
recognition site; further, in the presence of bound UL9,
streptavidin does not recognize the presence of the biotin
moiety in the oligonucleotide. Biotinylation at other A or
T positions did not have the two necessary characteristics
(i.e., UL9 binding and protection from streptavidin):
biotinylation at the adenosine in position #8 of the top
strand prevented the binding of UL9; biotinylation of
either adenosines or thymidines (top or bottom strand) at
positions #3, #4, #10, or #11 all allowed binding of UL9,
but in each case, streptavidin also was able to recognize
the presence of the biotin moiety and thereby bind the
oligonucleotide in the presence of UL9.

The above result (the ability of UL9 to bind to an
oligonucleotide containing a biotin within the recognition
sequence and to protect the biotin from streptavidin) was
unexpected in that methylation interference data (Koff et
al.) suggest that methylation of the deoxyguanosine
residues at positions #7 and #9 of the recognition sequence
(on either side of the biotinylated deoxyuridine) blocks
UL9 binding. In these methylation interference
experiments, guanosines are methylated by dimethyl sulfate
at the N7 position, which corresponds structurally to the
5-position of the pyrimidine ring at which the deoxyuridine
is biotinylated. These moieties all protrude into the
major groove of the DNA. The methylation interference data
suggest that the #7 and #9 position deoxyguanosines are
contact points for UL9; it was therefore unexpected that
the presence of a biotin moiety between them would not
interfere with binding.

The binding of the full length protein was relatively
unaffected by the presence of a biotin at position #8
within the UL9 binding site. The rate of dissociation was
similar for full length UL9 with both biotinylated and
un-biotinylated oligonucleotides. However, the rate of
dissociation of the truncated UL9-COOH polypeptide was
faster with the biotinylated oligonucleotides than with
non-biotinylated oligonucleotides, which is a rate
comparable to that of the full length protein with either
DNA.

The binding conditions were optimized for UL9-COOH so
that the off-rate of the truncated UL9 from the
biotinylated oligonucleotide was 5-10 minutes (optimized
conditions are given in Example 4), a rate compatible with
a mass screening assay. The use of multi-well plates to
conduct the DNA:protein assay of the present invention is
one approach to mass screening.

2) Capture of site-specific biotinylated
oligonucleotides.

The streptavidin:biotin interaction can be employed in
several different ways to remove unbound DNA from the
solution containing the DNA, protein, and test molecule or
mixture. Magnetic polystyrene or agarose beads, to which
streptavidin is covalently attached or attached through a
covalently attached biotin, can be exposed to the solution
for a brief period, then removed by use, respectively, of
magnets or a filter mesh. Magnetic streptavidinated beads
are currently the method of choice. Streptavidin has been
used in many of these experiments, but avidin is equally
useful.

An example of a second method for the removal of
unbound DNA is to attach streptavidin to a filter by first
linking biotin to the filter, binding streptavidin, then
blocking nonspecific protein binding sites on the filter
with a nonspecific protein such as albumin. The mixture is
then passed through the filter; unbound DNA is captured and
the bound DNA passes through the filter.

One convenient method to sequester captured DNA is the
use of streptavidin-conjugated superparamagnetic
polystyrene beads as described in Example 7. These beads
are added to the assay mixture to capture the unbound DNA.
After capture of DNA, the beads can be retrieved by placing
the reaction tubes in a magnetic rack, which sequesters the
beads on the reaction chamber wall while the assay mixture
is removed and the beads are washed. The captured DNA is
then detected using one of several DNA detection systems,
as described below.

Alternatively, avidin-coated agarose beads can be
used. Biotinylated agarose beads (immobilized D-biotin,
Pierce) are bound to avidin. Avidin, like streptavidin,
has four binding sites for biotin. One of these binding
sites is used to bind the avidin to the biotin that is
coupled to the agarose beads via a 16 atom spacer arm: the
other biotin binding sites remain available. The beads are
mixed with binding mixtures to capture biotinylated DNA
(Example 7). Alternative methods (Harlow et al.) to the
bead capture methods just described include the following
streptavidinated or avidinated supports: low-protein-binding
filters or 96-well plates.

B) Capture of DNA:protein complexes.

The amount of DNA:protein complex remaining in the
assay mixture in the presence of an inhibitory molecule can
also be determined as a measure of the relative effect of
the inhibitory molecule. A net decrease in the amount of
DNA:protein complex in response to a test molecule is an
indication of the presence of an inhibitor. DNA molecules
that are bound to protein can be captured on nitrocellulose
filters. Under low salt conditions, DNA that is not bound
to protein freely passes through the filter. Thus, by
passing the assay mixture rapidly through a nitrocellulose
filter, the DNA:protein complexes and unbound DNA molecules
can be rapidly separated. This has been accomplished on
nitrocellulose discs using a vacuum filter apparatus or on
slot blot or dot blot apparatuses (all of which are
available from Schleicher and Schuell, Keene, NH). The
assay mixture is applied to and rapidly passes through the
wetted nitrocellulose under vacuum conditions. Any
apparatus employing nitrocellulose filters or other filters
capable of retaining protein while allowing free DNA to
pass through the filter is suitable for this system.

C) Detection systems.

For either of the above capture methods, the amount of
DNA that has been captured is quantitated. The method of
quantitation depends on how the DNA has been prepared. If
the DNA is radioactively labelled, beads can be counted in
a scintillation counter, or autoradiographs can be taken of
dried gels or nitrocellulose filters. The amount of DNA
has been quantitated in the latter case by a densitometer
(Molecular Dynamics, Sunnyvale, CA); alternatively,
filters or gels containing radiolabeled samples can be
quantitated using a phosphoimager (Molecular Dynamics).
The captured DNA may also be detected using a
chemiluminescent or colorimetric detection system.

Radiolabelling and chemiluminescence (i) are very 
sensitive, allowing the detection of sub-femtomole 
quantities of oligonucleotide, and (ii) use well- 
established techniques. In the case of chemiluminescent 
detection, protocols have been devised to accommodate the 
requirements of a mass-screening assay. Non-isotopic DNA 
detection techniques have principally incorporated alkaline 
phosphatase as the detectable label given the ability of 
the enzyme to give a high turnover of substrate to product 
and the availability of substrates that yield 
chemiluminescent or colored products. 

1) Radioactive labeling. 

Many of the experiments described above for UL9
DNA:protein-binding studies have made use of radio-labelled
oligonucleotides. The techniques involved in
radiolabelling of oligonucleotides have been discussed
above. A specific activity of 10^8-10^9 dpm per µg DNA is
routinely achieved using standard methods (e.g.,
end-labeling the oligonucleotide with adenosine [γ-32P]-5'
triphosphate and T4 polynucleotide kinase). This level of
specific activity allows small amounts of DNA to be
measured either by autoradiography of gels or filters
exposed to film or by direct counting of samples in
scintillation fluid.
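
As a worked illustration of what such a specific activity means for
detection, the following sketch converts an amount of
oligonucleotide into expected counts. The average mass of 330 g/mol
per nucleotide, the 30-nucleotide length, the example specific
activity of 1e8 dpm per microgram, and the function name are all
assumptions made for illustration only.

```python
AVG_NT_MASS_G_PER_MOL = 330.0  # assumed average mass of one nucleotide


def expected_dpm(femtomoles: float, length_nt: int, dpm_per_ug: float) -> float:
    """Expected disintegrations per minute for a single-stranded
    oligonucleotide of the given length and molar amount."""
    grams_per_mol = length_nt * AVG_NT_MASS_G_PER_MOL
    micrograms = femtomoles * 1e-15 * grams_per_mol * 1e6
    return micrograms * dpm_per_ug


# One femtomole of a hypothetical 30-mer at 1e8 dpm per microgram.
print(round(expected_dpm(femtomoles=1.0, length_nt=30, dpm_per_ug=1e8)))
```

Under these assumptions, one femtomole of a 30-mer corresponds to
roughly a thousand dpm, which is readily counted.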

2) Chemiluminescent detection. 

For chemiluminescent detection, digoxigenin-labelled
oligonucleotides (Example 1) can be detected using the
chemiluminescent detection system "SOUTHERN LIGHTS,"
developed by Tropix, Inc. The detection system is
diagrammed in Figures 11A and 11B. The technique can be
applied to detect DNA that has been captured on beads, on
filters, or in solution.

Alkaline phosphatase is coupled to the captured DNA
without interfering with the capture system. To do this,
several methods, derived from commonly used ELISA (Harlow
et al.; Pierce, Rockford IL) techniques, can be employed.
For example, an antigenic moiety is incorporated into the
DNA at sites that will not interfere with (i) the
DNA:protein interaction, (ii) the DNA:drug interaction, or
(iii) the capture system. In the UL9 DNA:protein/biotin
system the DNA has been end-labelled with
digoxigenin-11-dUTP (dig-dUTP) and terminal transferase
(Example 1, Figure 4). After the DNA was captured and
removed from the DNA:protein mixture, an
anti-digoxigenin-alkaline phosphatase conjugated antibody
(Boehringer Mannheim, Indianapolis IN) was reacted with the
digoxigenin-containing oligonucleotide. The antigenic
digoxigenin moiety was recognized by the antibody-enzyme
conjugate. The presence of dig-dUTP altered neither the
ability of UL9-COOH protein to bind the oriS SEQ ID
NO:1-containing DNA nor the ability of streptavidin to bind
the incorporated biotin.

Captured DNA was detected using the alkaline
phosphatase-conjugated antibodies to digoxigenin as
follows. One chemiluminescent substrate for alkaline
phosphatase is 3-(2'-spiroadamantane)-4-methoxy-4-(3"-
phosphoryloxy)phenyl-1,2-dioxetane disodium salt (AMPPD)
(Example 7). Dephosphorylation of AMPPD results in an
unstable compound, which decomposes, releasing a prolonged,
steady emission of light at 477 nm. Light measurement is
very sensitive and can detect minute quantities of DNA.
Colorimetric substrates for the alkaline phosphatase
system have also been tested and are useable in the present
assay system.

An alternative to the above biotin capture system is
to use digoxigenin in place of biotin to modify the
oligonucleotide at a site protected by the DNA-binding
protein at the assay site: biotin is then used to replace
the digoxigenin moieties in the above described detection
system. In this arrangement the anti-digoxigenin antibody
is used to capture the oligonucleotide probe when it is
free of bound protein. Streptavidin conjugated to alkaline
phosphatase is then used to detect the presence of captured
oligonucleotides.

D) Alternative methods for detecting molecules that
increase the affinity of the DNA-binding protein for its
cognate site.

In addition to identifying molecules or compounds that
cause a decreased affinity of the DNA-binding protein for
the screening sequence, molecules may be identified that
increase the affinity of the protein for its cognate
binding site. In this case, leaving the capture system for
unbound DNA in contact with the assay for increasing
amounts of time allows the establishment of a fixed
off-rate for the DNA:protein interaction (for example SEQ ID
NO:1/UL9). In the presence of a stabilizing molecule, the
off-rate, as detected by the capture system, will be
decreased.

Using the capture system for DNA:protein complexes to
detect molecules that increase the affinity of the
DNA-binding protein for the screening sequence requires that
an excess of unlabeled oligonucleotide containing the UL9
binding site (but not the test sequences) is added to the
assay mixture. This is, in effect, an off-rate experiment.
In this case, the control sample (no test molecules or
mixtures added) will show a fixed off-rate (i.e., samples
would be taken at fixed intervals after the addition of the
unlabeled competition DNA molecule, applied to
nitrocellulose, and a decreasing amount of radiolabeled
DNA:protein complex would be observed). In the presence of
a DNA-binding test molecule that enhanced the binding of
UL9, the off-rate would be decreased (i.e., the amount of
radiolabeled DNA:protein complexes observed would not
decrease as rapidly at the fixed time points as in the
control sample).
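
The comparison of off-rates described above can be sketched as
follows. This is an illustrative analysis only: it assumes
first-order decay of the filter-retained signal, estimates the rate
constant by a least-squares fit of ln(signal) against time, and uses
synthetic readings. The function name and all numerical values are
hypothetical.

```python
import math


def fit_off_rate(times_min, complex_signal):
    """Least-squares slope of ln(signal) versus time; returns k_off (1/min)."""
    logs = [math.log(s) for s in complex_signal]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    return -sxy / sxx


# Synthetic readings (arbitrary units) taken at fixed intervals after
# adding excess unlabeled competitor DNA.
times = [0, 2, 4, 6, 8, 10]
control = [1000 * math.exp(-0.10 * t) for t in times]
with_test_molecule = [1000 * math.exp(-0.04 * t) for t in times]

k_control = fit_off_rate(times, control)
k_test = fit_off_rate(times, with_test_molecule)
print(f"control k_off = {k_control:.3f}/min, with test molecule = {k_test:.3f}/min")
if k_test < k_control:
    print("slower dissociation: candidate stabilizing molecule")
```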



III. Utility 

A. The Usefulness of Sequence-Specific DNA-Binding
Molecules.

The present invention defines a high throughput in
vitro screening assay to test large libraries of biological
or chemical mixtures for the presence of DNA-binding
molecules having sequence binding preference. The assay is
also capable of determining the sequence-specificity and
relative affinity of known DNA-binding molecules or
purified unknown DNA-binding molecules. Sequence-specific
DNA-binding molecules are of particular interest for
several reasons, which are listed here. These reasons, in
part, outline the rationale for determining the usefulness
of DNA-binding molecules as therapeutic agents:

1) Generally, for a given DNA:protein interaction,
there are several thousand fewer target DNA-binding
sequences per cell than protein molecules that bind to the
DNA. Accordingly, even fairly toxic molecules might be
delivered in sufficiently low concentration to exert a
biological effect by binding to the target DNA sequences.

2) DNA has a relatively more well-defined structure
compared to RNA or protein. Since the general structure of
DNA has less tertiary structural variation, identifying or
designing specific binding molecules should be easier for
DNA than for either RNA or protein. Double-stranded DNA is
a repeating structure of deoxyribonucleotides that stack
atop one another to form a linear helical structure. In
this manner, DNA has a regularly repeating "lattice"
structure that makes it particularly amenable to molecular
modeling refinements and hence, drug design and
development.

3) Many "single-copy" genes (of which there are only
1 or 2 copies in the cell) are transcribed into multiple,
potentially thousands, of RNA molecules, each of which may
be translated into many proteins. Accordingly, targeting
any DNA site, whether it is a regulatory sequence or a
coding or noncoding sequence, may require a much lower drug
dose than targeting RNAs or proteins.

Proteins (e.g., enzymes, receptors, or structural
proteins) are currently the targets of most therapeutic
agents. More recently, RNA molecules have become the
targets for antisense or ribozyme therapeutic molecules.
4) Blocking the function of an RNA, which encodes a
protein, or of a corresponding protein, when that protein
regulates several cellular genes, may have detrimental
effects, particularly if some of the regulated genes are
important for the survival of the cell. However, blocking
a DNA-binding site that is specific to a single gene
regulated by such a protein results in reduced toxicity.

An example of situation (4) is HNF-1 binding to Hepatitis
B virus (HBV): HNF-1 binds an HBV enhancer sequence and
stimulates transcription of HBV genes (Chang et al.). In
a normal cell HNF-1 is a nuclear protein that appears to be
important for the regulation of many genes, particularly
liver-specific genes (Courtois et al.). If molecules were
isolated that specifically bound to the DNA-binding domain
of HNF-1, all of the genes regulated by HNF-1 would be
down-regulated, including both viral and cellular genes.
Such a drug could be lethal since many of the genes
regulated by HNF-1 may be necessary for liver function.
However, the assay of the present invention presents the
ability to screen for a molecule that could distinguish the
HNF-1 binding region of the Hepatitis B virus DNA from
cellular HNF-1 sites by, for example, including divergent
flanking sequences when screening for the molecule. Such
a molecule would specifically block HBV expression without
affecting cellular gene expression.

B. General Applications of the Assay.

General applications of the assay include but are not
limited to screening libraries of uncharacterized compounds
(e.g., biological, chemical or synthetic compounds) for
sequence-specific DNA-binding molecules (part III.B.1);
determining the sequence-specificity or preference and/or
relative affinities of DNA-binding molecules (part
III.B.2); and testing of modified derivatives of
DNA-binding molecules for altered specificity or affinity
(part III.B.3). In particular, since each test compound is
screened against up to 4^N sequences, where N is the number
of basepairs in the test sequence, the method will generate
up to 4^N structure/activity data points for analyzing the
relationship between compound structure and binding
activity, as evidenced by protein binding to an adjacent
sequence.
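
The size of the test-sequence space grows as the fourth power of
four with each added basepair pair, i.e. as 4 raised to the length
N. A minimal sketch of enumerating every possible test sequence of a
given length is shown below; the function name is hypothetical and
the enumeration is purely illustrative of the combinatorics, not of
how oligonucleotides are prepared.

```python
from itertools import product


def all_test_sequences(n: int):
    """Yield every DNA sequence of length n over the alphabet A, C, G, T."""
    for bases in product("ACGT", repeat=n):
        yield "".join(bases)


# Number of distinct test sequences for a few lengths.
for n in (2, 4, 8):
    print(n, 4 ** n)

# The first few 2-basepair test sequences.
print(list(all_test_sequences(2))[:5])
```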

1) Mass-screening of libraries for the presence 
of sequence-specific DNA-binding molecules. 

Many organizations (e.g., the National Institutes of
Health, pharmaceutical and chemical corporations) have
large libraries of chemical or biological compounds from
synthetic processes, fermentation broths or extracts that
may contain as yet unidentified DNA-binding molecules. One
utility of the assay of the present invention is to apply
the assay system to the mass-screening of these libraries
of different broths, extracts, or mixtures to detect the
specific samples that contain the DNA-binding molecules.
Once the specific mixtures that contain the DNA-binding
molecules have been identified, the assay has a further
usefulness in aiding in the purification of the DNA-binding
molecule from the crude mixture.

As purification schemes are applied to the mixture,
the assay can be used to test the fractions for DNA-binding
activity. The assay is amenable to high throughput (e.g.,
a 96-well plate format automated on robotics equipment such
as a Beckman Biomek workstation [Beckman, Palo Alto, CA]
with detection using semiautomated plate-reading
densitometers, luminometers, or phosphoimagers).
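
One way plate-reader output from such a screen might be reduced to
a list of candidate wells is sketched below. The decision rule (a
well is flagged when its captured unbound-DNA signal exceeds the
mean of the no-test-molecule controls by three standard deviations)
is an assumption chosen for illustration and is not specified
herein; the well names and readings are hypothetical.

```python
from statistics import mean, stdev


def flag_hits(control_signals, well_signals, n_sd=3.0):
    """Return the names of wells whose captured unbound-DNA signal exceeds
    the control mean by more than n_sd sample standard deviations."""
    threshold = mean(control_signals) + n_sd * stdev(control_signals)
    return sorted(well for well, s in well_signals.items() if s > threshold)


# Hypothetical readings in arbitrary units.
controls = [100, 104, 98, 101, 97, 102]
wells = {"A1": 103, "A2": 240, "A3": 99, "B7": 131}
print(flag_hits(controls, wells))
```

The same function could be applied to column fractions during
purification, with each fraction treated as a well.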

2) The assay of the present invention is also
useful for screening molecules that are currently described
in the literature as DNA-binding molecules, but which have
uncertain DNA-binding sequence specificity (i.e., having
either no well-defined preference for binding to specific
DNA sequences or having certain higher affinity binding
sites but without defining the relative preference for all
possible DNA binding sequences). The assay can be used to
determine the specific binding sites for DNA-binding
molecules, among all possible choices of sequence that bind
with high, low, or moderate affinity to the DNA-binding
molecule. Actinomycin D, Distamycin A, and Doxorubicin
(Example 6) all provide examples of molecules with these
modes of binding. Many anti-cancer drugs, such as
Doxorubicin (see Example 6), show binding preference for
certain identified DNA sequences, although the absolute
highest and lowest specificity sequences have yet to be
determined, because, until the invention described herein,
the methods (Sales, X. and Portugal, J.; Cullinane, C. and
Phillips, D.R.; Phillips, D.R.; and Phillips, D.R. et al.)
for detecting differential affinity DNA-binding sites for
any drug were limited. Doxorubicin is one of the most
widely used anti-cancer drugs currently available. As
shown in Example 6, Doxorubicin is known to bind some
sequences preferentially. Another example of such sequence
binding preference is Daunorubicin (Chen et al.), which
differs slightly in structure from Doxorubicin (Goodman et
al.). Both Daunorubicin and Doxorubicin are members of the
anthracycline antibiotic family: antibiotics in this
family, and their derivatives, are important antitumor
agents (Goodman et al.).

The assay of the present invention allows the sequence
preferences or specificities of DNA-binding molecules to be
determined. The DNA-binding molecules for which sequence
preference or specificity can be determined may include
small molecules such as aminoacridines and polycyclic
hydrocarbons, planar dyes, various DNA-binding antibiotics
and anticancer drugs, as well as DNA-binding macromolecules
such as peptides and polymers that bind to nucleic acids
(e.g., DNA and the derivatized homologs of DNA that bind to
the DNA helix).

The molecules that can be tested in the assay for
sequence preference/specificity and relative affinity to
different DNA sites include both major and minor groove
binders as well as intercalating and non-intercalating DNA
binders.

3) The assay of the present invention facilitates the
identification of different binding activities by molecules
derived from known DNA-binding molecules. An example would
be to identify derivatives and test them for
DNA-binding activity using the assay of the present
invention. Derivatives having DNA-binding activity are
then tested for anti-cancer activity through, for example,
a battery of assays performed by the National Cancer
Institute (Bethesda MD). Further, the assay of the present
invention can be used to test derivatives of known
anti-cancer agents to examine the effect of the modifications
(such as methylation, ethylation and other derivatizations)
on DNA-binding activity and specificity. The assay
provides (i) an initial screen for the design of better
therapeutic derivatives of known agents and (ii) a method
to provide a better understanding of the mode of action of
such therapeutic derivatives.

4) The screening capacity of this assay is much
greater than screening each separate DNA sequence with an
individual cognate DNA-binding protein. While direct
competition assays involving individual receptor:ligand
complexes (e.g., a specific DNA:protein complex) are most
commonly used for mass screening efforts, each assay
requires the identification, isolation, purification, and
production of the assay components. Using the assay of the
present invention, libraries of synthetic chemicals or
biological molecules can be screened for detecting
molecules that have preferential binding to virtually any
specified DNA sequence using a single assay system.
Secondary screens involving the specific DNA:protein
interaction may not be necessary, since inhibitory
molecules detected in the assay may be tested directly on
a biological system (e.g., the ability to disrupt viral
replication in a tissue culture or animal model).

C. Sequences Targeted by the Assay.

The DNA:protein assay of the present invention has
been designed to screen for compounds that bind a full
range of DNA sequences that vary in length as well as
complexity. Sequence-specific DNA-binding molecules
discovered by the assay have potential usefulness as either
molecular reagents, therapeutics, or therapeutic
precursors. Table I lists several potential specific test
sequences. Sequence-specific DNA-binding molecules are
potentially powerful therapeutics for essentially any
disease or condition that in some way involves DNA.
Examples of test sequences for the assay include: a)
binding sequences of factors involved in the maintenance or
propagation of infectious agents, especially viruses,
bacteria, yeast and other fungi, b) sequences causing the
inappropriate expression of certain cellular genes, and c)
sequences involved in the replication of rapidly growing
cells.

Furthermore, gene expression or replication does not
necessarily need to be disrupted by blocking the binding of
specific proteins. Specific sequences within coding
regions of genes (e.g., oncogenes) are equally valid test
sequences since the binding of small molecules to these
sequences is likely to perturb the transcription and/or
replication of the region. Finally, any molecules that
bind DNA with some sequence specificity, that is, not just
to one particular test sequence, may still be useful as
anti-cancer agents. Several small molecules with some
sequence preference are already in use as anticancer
therapeutics. Molecules identified by the present assay
may be particularly valuable as lead compounds for the
development of congeners (i.e., chemical derivatives of a
molecule) with either different specificity or different
affinity.

One advantage of the present invention is that the
assay is capable of screening for binding activity directed
against any DNA sequence. Such sequences can be medically
significant target sequences (see part 1, Medically
Significant Target Sites, in this section), scrambled or
randomly generated DNA sequences, or well-defined, ordered
sets of DNA sequences (see part 2, Ordered Sets of Test
Sequences, in this section), which could be used for
screening for molecules demonstrating sequence preferential
binding (like Doxorubicin) to determine the sequences with
highest binding affinity and/or to determine the relative
affinities between a large number of different
sequences. There is usefulness in taking either approach
for detecting and/or designing new therapeutic agents.
Part 3 of this section, Theoretical Considerations for
Choosing Target Sequences, outlines the theoretical
considerations for choosing DNA target sites in a
biological system.

1) Medically significant target sequences.

Few effective viral therapeutics are currently
available; yet several potential target sequences for
antiviral DNA-binding drugs have been well-characterized.
Furthermore, with the accumulation of sequence data on all
biological systems, including viral genomes, cellular
genomes, pathogen genomes (bacteria, fungi, eukaryotic
parasites, etc.), the number of target sites for
DNA-binding drugs will increase greatly in the future.
Medically significant target sites can be defined as short
DNA sequences (approximately 4-30 base pairs) that are
required for the expression or replication of genetic
material. For example, sequences that bind regulatory
factors, either transcriptional or replicatory factors,
would be ideal target sites for altering gene or viral
expression. Secondly, coding sequences may be adequate
target sites for disrupting gene function. Thirdly, even
non-coding, non-regulatory sequences may be of interest as
target sites (e.g., for disrupting replication processes or
introducing an increased mutational frequency). Some
specific examples of medically significant target sites are
shown in Table I.

TABLE I. MEDICALLY SIGNIFICANT DNA-BINDING SEQUENCES

Test sequence                DNA-binding Protein   Medical Significance
---------------------------  --------------------  ---------------------------
EBV origin of replication    EBNA                  infectious mononucleosis,
                                                   nasal pharyngeal carcinoma
HSV origin of replication    UL9                   oral and genital Herpes
VZV origin of replication    UL9-like              shingles
HPV origin of replication    E2                    genital warts, cervical
                                                   carcinoma
Interleukin 2 enhancer       NFAT-1                immunosuppressant
HIV LTR                      NFAT-1, NFkB          AIDS, ARC
HBV enhancer                 HNF-1                 hepatitis
Fibrinogen promoter          HNF-1                 cardiovascular disease
Oncogene promoter and        n                     cancer
coding sequences

(Abbreviations: EBV, Epstein-Barr virus; EBNA,
Epstein-Barr virus nuclear antigen; HSV, Herpes Simplex virus;
VZV, Varicella zoster virus; HPV, human papilloma virus;
HIV LTR, Human immunodeficiency virus long terminal repeat;
NFAT, nuclear factor of activated T cells; NFkB, nuclear
factor kappaB; AIDS, acquired immune deficiency syndrome;
ARC, AIDS related complex; HBV, hepatitis B virus; HNF,
hepatic nuclear factor.)

The origin of replication binding proteins, Epstein
Barr virus nuclear antigen 1 (EBNA-1) (Ambinder, R.F., et
al.; Reisman, D., et al.), E2 (which is encoded by the human
papilloma virus) (Chin, M.T., et al.), UL9 (which is
encoded by herpes simplex virus type 1) (McGeoch, D.J., et
al.), and the homologous protein in varicella zoster virus
(VZV) (Stow, N.D. and Davison, A.J.), have short,
well-defined binding sites within the viral genome and are
therefore excellent target sites for a competitive
DNA-binding drug. Similarly, recognition sequences for
DNA-binding proteins that act as transcriptional regulatory
factors are also good target sites for antiviral
DNA-binding drugs. Examples include the binding site for
hepatic nuclear factor (HNF-1), which is required for the
expression of human hepatitis B virus (HBV) (Chang, H.-K.),
and NFkB and NFAT-1 binding sites in the human
immunodeficiency virus (HIV) long terminal repeat (LTR),
one or both of which may be involved in the expression of
the virus (Greene, W.C.).

Examples of non-viral DNA targets for DNA-binding
drugs are also shown in Table I to illustrate the wide
range of potential applications for sequence-specific
DNA-binding molecules. For example, nuclear factor of
activated T cells (NFAT-1) is a regulatory factor that is
crucial to the inducible expression of the interleukin 2
(IL-2) gene in response to signals from the antigen
receptor, which, in turn, is required for the cascade of
molecular events during T cell activation (for review, see
Edwards, C.A. and Crabtree, G.R.). The mechanism of action
of two immunosuppressants, cyclosporin A and FK506, is
thought to be to block the inducible expression of NFAT-1
(Schmidt, A. et al. and Banerji, S.S. et al.). However, the
effects of these drugs are not specific to NFAT-1;
therefore, a drug targeted specifically to the NFAT-1
binding site in the IL-2 enhancer would be desirable as an
improved immunosuppressant.

Targeting the DNA site with a DNA-binding drug rather 
than targeting with a drug that affects the DNA-binding 
protein (presumably the target of the current 
immunosuppressants) is advantageous for at least two 
reasons: first, there are many fewer target sites for 
specific DNA sequences than specific proteins (e.g., in the 
case of glucocorticoid receptor, a handful of DNA-binding 
sites vs. about 50,000 protein molecules in each cell); and 
secondly, only the targeted gene need be affected by a 
DNA-binding drug, while a protein-binding drug would disable 
all the cellular functions of the protein. 

An example of the latter point is the binding site for 
HNF-1 in the human fibrinogen promoter. Fibrinogen level 
is one of the factors most highly correlated with 
cardiovascular disease. A drug targeted to either HNF-1 or 
the HNF-1 binding site in the fibrinogen promoter might be 
used to decrease fibrinogen expression in individuals at 
high risk for disease because of the overexpression of 
fibrinogen. However, since HNF-1 is required for the 
expression of a number of normal hepatic genes, blocking 
the HNF-1 protein would be toxic to liver function. In 
contrast, by blocking a DNA sequence that is composed in 
part of the HNF-1 binding site and in part of flanking 
sequences for divergence, the fibrinogen gene can be 
targeted with a high level of selectivity, without harm to 
normal cellular HNF-1 functions. 

The assay has been designed to screen virtually any 
DNA sequence. As described above, test sequences of 
medical significance include viral or microbial pathogen 
genomic sequences and sequences within or regulating the 
expression of oncogenes or other inappropriately expressed 
cellular genes. In addition to the detection of potential 
antiviral drugs, the assay of the present invention is also 
applicable to the detection of potential drugs for (i) 
disrupting the metabolism of other infectious agents, (ii) 
blocking or reducing the transcription of inappropriately 
expressed cellular genes (such as oncogenes or genes 
associated with certain genetic disorders), and (iii) 
enhancing or altering the expression of certain cellular 
genes. 

2) Defined sets of test sequences. 

The approach described in the above section discusses 
screening large numbers of fermentation broths, extracts, 
or other mixtures of unknowns against specific medically 
significant DNA target sequences. The assay can also be 
utilized to screen a large number of DNA sequences against 
known DNA-binding drugs to determine the relative affinity 
of the single drug for every possible defined specific 
sequence. For example, there are 4^n possible sequences, 
where n = the number of nucleotides in the sequence. Thus, 
there are 4^3 = 64 different three base pair sequences, 4^4 = 
256 different four base pair sequences, 4^5 = 1024 different 
five base pair sequences, etc. If these sequences are placed 
in the test site, the site adjacent to the screening 
sequence (the example used in this invention is the UL9 
binding site), then each of the different test sequences 
can be screened against many different DNA-binding 
molecules. The test sequences may be placed on either or 
both sides of the screening sequence, and the sequences 
flanking the other side of the test sequences are fixed 
sequences to stabilize the duplex and, on the 3' end of the 
top strand, to act as an annealing site for the primer (see 
Example 1). For example, oligonucleotide sequences could 
be constructed as shown in Figure 15 (SEQ ID NO: 18). In 
Figure 15 the TEST and SCREENING sequences are indicated. 

The preparation of such double-stranded 
oligonucleotides is described in Example 1 and illustrated 
in Figures 4A and 4B. The test sequences, denoted in Figure 
15 as X:Y (where X = A, C, G, or T and Y = the complementary 
base, T, G, C, or A), may be any of the 256 different 4 
base pair sequences shown in Figure 13. 

Once a set of test oligonucleotides containing all 
possible four base pair sequences has been synthesized (see 
Example 1), the set can be screened with any DNA-binding 
drug. The relative effect of the drug on each 
oligonucleotide assay system will reflect the relative 
affinity of the drug for the test sequence. The entire 
spectrum of affinities for each particular DNA sequence can 
therefore be defined for any particular DNA-binding drug. 
The data generated using this approach can be used to 
facilitate molecular modeling programs and/or be used 
directly to design new DNA-binding molecules with increased 
affinity and specificity. 

Another type of ordered set of oligonucleotides that 
may be useful for screening is a set comprised of scrambled 
sequences with fixed base composition. For example, if the 
recognition sequence for a protein is 5'-GATC-3' and 
libraries were to be screened for DNA-binding molecules 
that recognized this sequence, then it would be desirable 
to screen sequences of similar size and base composition as 
control sequences for the assay. The most precise 
experiment is one in which all possible 4 bp sequences are 
screened; this represents 4^4 = 256 different test 
sequences, a number that may not be practical in every 
situation. However, there are many fewer possible 4 bp 
sequences with the same base composition (using the bases 
1G, 1A, 1T, 1C; n! = 24 different 4 bp sequences with this 
particular base composition), which provides excellent 
controls without having to screen large numbers of 
sequences. 
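
As a brief illustration of the fixed-composition control set, the following Python sketch lists every distinct ordering of the bases in a recognition sequence. The function name is hypothetical; the only input taken from the text is the example site 5'-GATC-3'.

```python
from itertools import permutations


def scrambled_controls(site):
    """All distinct orderings of the bases in `site` (the site itself included).

    A set is used so that sites with repeated bases do not yield duplicates.
    """
    return sorted({"".join(p) for p in permutations(site)})


if __name__ == "__main__":
    controls = scrambled_controls("GATC")
    print(len(controls))
    print(controls)
```

For GATC, which has four different bases, the list has 4! = 24 entries, as stated above; a site with repeated bases would give fewer.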

3) Theoretical considerations in choosing 
biological target sites: Specificity and Toxicity. 

One consideration in choosing sequences to screen 
using the assay of the present invention is test sequence 
accessibility, that is, the potential exposure of the 
sequence in vivo to binding molecules. Cellular DNA is 
packaged in chromatin, rendering most sequences relatively 
inaccessible. Sequences that are actively transcribed, 
particularly those sequences that are regulatory in nature, 
are less protected and more accessible to both proteins and 
small molecules. This observation is substantiated by a 
large literature on DNAase I sensitivity, footprinting 
studies with nucleases and small molecules, and general 
studies on chromatin structure (Tullius). The relative 
accessibility of a regulatory sequence, as determined by 
DNAase I hypersensitivity, is likely to be several orders 
of magnitude greater than that of an inactive portion of the 
cellular genome. For this reason the regulatory sequences 
of cellular genes, as well as viral regulatory or 
replication sequences, are useful regions to choose for 
selecting specific inhibitory small molecules using the 
assay of the present invention. 

Another consideration in choosing sequences to be 
screened using the assay of the present invention is the 
uniqueness of the potential test sequence. As discussed 
above for the nuclear protein HNF-1, it is desirable that 
small inhibitory molecules are specific to their target 
with minimal cross reactivity. Both sequence composition 
and length affect sequence uniqueness. Further, certain 
sequences are found less frequently in the human genome 
than in the genomes of other organisms, for example, 
mammalian viruses. Because of base composition and codon 
utilization differences, viral sequences are distinctly 
different from mammalian sequences. As one example, the 
dinucleotide CG is found much less frequently in mammalian 
cells than the dinucleotide sequence GC; further, in SV40, 
a mammalian virus, the sequences AGCT and ACGT are 
represented 34 and 0 times, respectively. Specific viral 
regulatory sequences can be chosen as test sequences 
keeping this bias in mind. Small inhibitory molecules 
identified which bind to such test sequences will be less 
likely to interfere with cellular functions. 

There are approximately 3 x 10^9 base pairs (bp) in the 
human genome. Of the known DNA-binding drugs for which 
there is crystallographic data, most bind 2-5 bp sequences. 
There are 4^4 = 256 different 4 base sequences; therefore, 
on average, a single 4 bp site is found roughly 1.2 x 10^7 
times in the human genome. An individual 8 base site would 
be found, on average, about 50,000 times in the genome. On 
the surface, it might appear that drugs targeted at even an 
8 bp site might be deleterious to the cell because there 
are so many binding sites; however, several other 
considerations must be recognized. First, most DNA is 
tightly wrapped in chromosomal proteins and is relatively 
inaccessible to incoming DNA-binding molecules as 
demonstrated by the nonspecific endonucleolytic digestion 
of chromatin in the nucleus (Edwards, C.A. and Firtel, R.A.). 

Active transcription units are more accessible than 
DNA bound in chromosomal proteins, but the most highly 
exposed regions of DNA in chromatin are the sites that bind 
regulatory factors. As demonstrated by DNAase I 
hypersensitivity (Gross, D.S. and Garrard, W.T.), 
regulatory sites may be 100-1000 times more sensitive to 
endonucleolytic attack than the bulk of chromatin. This is 
one reason for targeting regulatory sequences with 
DNA-binding drugs. Secondly, the fact that several 
anticancer drugs that bind 2, 3, or 4 bp sequences have 
sufficiently low toxicity that they can be used as drugs 
indicates that, if high affinity binding sites for known 
drugs can be matched with specific viral target sequences, 
it may be possible to use currently available drugs as 
antiviral agents at lower concentrations than they are 
currently used, with a concomitantly lower toxicity. 
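
The frequency estimates given above follow from dividing the genome size by the number of possible sequences of a given length. The short Python sketch below reproduces that arithmetic under simplifying assumptions that are not stated in the text: all four bases are equally frequent, neighbouring positions are independent, and only one strand is counted. The function name is hypothetical.

```python
GENOME_BP = 3e9  # approximate size of the human genome, as stated in the text


def expected_occurrences(site_length, genome_bp=GENOME_BP):
    """Expected number of occurrences of one specific site of `site_length` bp.

    Assumes equal base frequencies and independent positions, so each
    specific site has probability 1 / 4**site_length at any position.
    """
    return genome_bp / 4 ** site_length


if __name__ == "__main__":
    for n in (4, 8, 12, 16):
        print(n, round(expected_occurrences(n)))
```

Under these assumptions a 4 bp site is expected about 1.2 x 10^7 times and an 8 bp site about 46,000 times, consistent with the rounded values above; the 12 and 16 bp rows simply extend the same arithmetic to longer sites.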

D. Using Test Matrices and Pattern Matching for the 
Analysis of Data. 

The assay described herein has been designed to use a 
single DNA:protein interaction to screen for 
sequence-specific and sequence-preferential DNA-binding molecules 
that can recognize virtually any specified sequence. By 
using sequences flanking the recognition site for a single 
DNA:protein interaction, a very large number of different 
sequences can be tested. The analysis of data yielded by 
such experiments, displayed as matrices and analyzed by 
pattern matching techniques, should yield information about 
the relatedness of DNA sequences. 

The basic principle behind the DNA:protein assay of 
the present invention is that when molecules bind DNA 
sequences flanking the recognition sequence for a specific 
protein, the binding of that protein is blocked. 
Interference with protein binding likely occurs by either 
(or both) of two mechanisms: 1) directly by steric 
hindrance, or 2) indirectly by perturbations transmitted to 
the recognition sequence through the DNA molecule, a type 
of allosteric perturbation. 

Both of these mechanisms will presumably exhibit 
distance effects. For inhibition by direct steric 
hindrance, direct data for very small molecules are available 
from methylation and ethylation interference studies. 
These data suggest that for methyl and ethyl moieties, the 
steric effect is limited by distance effects to 4-5 base 
pairs. Even so, the number of different sequences that 
can theoretically be tested for these very small molecules 
is very large (i.e., 5 base pair combinations total 
4^5 (= 1024) different sequences). 

In practice, the size of sequences tested can be 
explored empirically for different sized test DNA-binding 
molecules. A wide array of sequences with increasing 
sequence complexity can be routinely investigated. This 
may be accomplished efficiently by synthesizing degenerate 
oligonucleotides and multiplexing oligonucleotides in the 
assay process (i.e., using a group of different 
oligonucleotides in a single assay) or by employing pooled 
sequences in test matrices. 

In view of the above, assays employing a specific 
protein and oligonucleotides containing the specific 
recognition site for that protein flanked by different 
sequences on either side of the recognition site can be 
used to simultaneously screen for many different molecules, 
including small molecules, that have binding preferences 
for individual sequences or families of related sequences. 
Figure 12 demonstrates how the analysis of a test matrix 
yields information about the nature of competitor sequence 
specificity. As an example, to screen for molecules that 
could preferentially recognize each of the 256 possible 
tetranucleotide sequences (Figure 13), oligonucleotides 
could be constructed that contain these 256 sequences 
immediately adjacent to an 11 bp recognition sequence of UL9 
oriS (SEQ ID NO: 15), which is identical in each construct. 

In Figure 12 "+" indicates that the mixture retards 
or blocks the formation of DNA:protein complexes in 
solution and "-" indicates that the mixture had no marked 
effect on DNA:protein interactions. A summary of the 
results of the test from Figure 12 is shown in Table 2. 

TABLE 2 

Test mix      Specificity 

#1, 4, 7      none detected for the above oligos 
#2            either nonspecific or specific for recognition site 
#3            AGCT 
#5            CATT or ATT 
#6            CCATTC, GCATT, CATTC, GCAT, or ATTC 
#8            CCCT 


These results demonstrate how such a matrix provides 
data on the presence of sequence specific binding activity 
in a test mixture and also provides inherent controls for 
non-specific binding. For example, the effect of test mix 
#8 on the different test assays reveals that the test mix 
preferentially affects the oligonucleotides that contain 
the sequence CCCT. Note that the sequence does not have to 
be within the test site for test mix #8 to exert an effect. 
By displaying the data in a matrix, the analysis of the 
sequences affected by the different test mixtures is 
facilitated. 
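
One simple way to carry out the kind of pattern matching described above is to look for subsequences that are shared by every oligonucleotide a test mix affects and that are absent from every oligonucleotide it does not affect. The Python sketch below is only an illustration of that idea: the function names and the four oligonucleotide sequences are hypothetical and are not the data of Figure 12, and only the top strand is examined.

```python
def substrings(seq, min_len=3):
    """All substrings of `seq` that are at least `min_len` bases long."""
    return {
        seq[i:j]
        for i in range(len(seq))
        for j in range(i + min_len, len(seq) + 1)
    }


def candidate_motifs(oligos, affected, min_len=3):
    """Subsequences common to all affected oligos and absent from the rest.

    oligos:   dict mapping an oligo name to its (top strand) sequence
    affected: set of oligo names scored "+" for one test mix
    Returns candidates sorted longest first, then alphabetically.
    """
    hits = [seq for name, seq in oligos.items() if name in affected]
    misses = [seq for name, seq in oligos.items() if name not in affected]
    if not hits:
        return []
    shared = set.intersection(*(substrings(s, min_len) for s in hits))
    specific = {m for m in shared if not any(m in s for s in misses)}
    return sorted(specific, key=lambda m: (-len(m), m))


if __name__ == "__main__":
    # Hypothetical matrix row: one test mix scored against four oligos.
    oligos = {"A": "AGCTGG", "B": "TTCCCT", "C": "CCCTAA", "D": "GATCGA"}
    print(candidate_motifs(oligos, affected={"B", "C"}))
```

With this made-up input the longest candidate is CCCT, which mirrors the reasoning applied to test mix #8. A mix that affects every oligonucleotide, or none, yields no discriminating candidate, which corresponds to the nonspecific and "none detected" rows of the table.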

E. Other Applications. 

The potential pharmaceutical applications for 
sequence-specific DNA-binding molecules are broad, 
including antiviral, antifungal, antibacterial, and antitumor 
agents, immunosuppressants, and cardiovascular drugs. 
Sequence-specific DNA-binding molecules can also be useful 
as molecular reagents as, for example, specific sequence 
probes. 

As more molecules are detected, information about the 
nature of DNA-binding molecules will be gathered, 
eventually facilitating the design and/or modification of 
new molecules with different or specialized activities. 

Although the assay has been described in terms of the 
detection of sequence-specific DNA-binding molecules, the 
reverse assay could be achieved by adding DNA in excess to 
protein to look for peptide sequence-specific 
protein-binding inhibitors. 

The following examples illustrate, but are in no way 
intended to limit, the present invention. 

Materials and Methods 

Synthetic oligonucleotides were prepared using 
commercially available automated oligonucleotide 
synthesizers. Alternatively, custom designed synthetic 
oligonucleotides may be purchased, for example, from Synthetic 
Genetics (San Diego, CA). Complementary strands were 
annealed to generate double-strand oligonucleotides. 

Restriction enzymes were obtained from Boehringer 
Mannheim (Indianapolis IN) or New England Biolabs (Beverly 
MA) and were used as per the manufacturer's directions. 

Distamycin A and Doxorubicin were obtained from Sigma 
(St. Louis, MO). Actinomycin D was obtained from 
Boehringer Mannheim or Sigma. 

Example 1 

Preparation of the Oligonucleotide Containing the 
Screening Sequence 

This example describes the preparation of (i) 
biotinylated/digoxigenin/radiolabelled, and (ii) 
radiolabelled double-stranded oligonucleotides that contain the 
screening sequence and selected Test sequences. 

A. Biotinylation. 

The oligonucleotides were prepared as described above. 
The wild-type control sequence for the UL9 binding site, as 
obtained from HSV, is shown in Figure 4. The screening 
sequence, i.e. the UL9 binding sequence, is CGTTCGCACTT 
(SEQ ID NO:1) and is underlined in Figure 4A. Typically, 
sequences 5' and/or 3' to the screening sequence were 
replaced by a selected Test sequence (Figure 5). 

One example of the preparation of a site-specifically 
biotinylated oligonucleotide is outlined in Figure 4. An 
oligonucleotide primer complementary to the 3' sequences of 
the screening sequence-containing oligonucleotide was 
synthesized. This oligonucleotide terminated at the 
residue corresponding to the C in position 9 of the 
screening sequence. The primer oligonucleotide was 
hybridized to the oligonucleotide containing the screening 
sequence. Biotin-11-dUTP (Bethesda Research Laboratories 
(BRL), Gaithersburg MD) and Klenow enzyme were added to 
this complex (Figure 4) and the resulting partially 
double-stranded biotinylated complexes were separated from the 
unincorporated nucleotides using either pre-prepared G-25 
Sephadex spin columns (Pharmacia, Piscataway NJ) or 
"NENSORB" columns (New England Nuclear) as per 
manufacturer's instructions. The remaining single-strand 
region was converted to double-strands using DNA polymerase 
I Klenow fragment and dNTPs, resulting in a fully 
double-stranded oligonucleotide. A second G-25 Sephadex column 
was used to purify the double-stranded oligonucleotide. 
Oligonucleotides were diluted or resuspended in 10 mM 
Tris-HCl, pH 7.5, 50 mM NaCl, and 1 mM EDTA and stored at -20°C. 
For radiolabelling the complexes, 32P-alpha-dCTP (New 
England Nuclear, Wilmington, DE) replaced dCTP for the 
double-strand completion step. Alternatively, the top 
strand, the primer, or the fully double-stranded 
oligonucleotide has been radiolabeled with gamma-32P-ATP and 
polynucleotide kinase (NEB, Beverly, MA). Preliminary 
studies have employed radiolabeled, double-stranded 
oligonucleotides. The oligonucleotides are prepared by 
radiolabeling the primer with T4 polynucleotide kinase and 
gamma-32P-ATP, annealing the "top" strand full length 
oligonucleotide, and "filling-in" with Klenow fragment and 
deoxynucleotide triphosphates. After phosphorylation and 
second strand synthesis, oligonucleotides are separated 
from buffer and unincorporated triphosphates using G-25 
Sephadex preformed spin columns (IBI or Biorad). This 
process is outlined in Figure 4B. The reaction conditions 
for all of the above Klenow reactions were as follows: 10 
mM Tris-HCl, pH 7.5, 10 mM MgCl2, 50 mM NaCl, 1 mM 
dithioerythritol, 0.33-100 µM deoxytriphosphates, 2 units 
Klenow enzyme (Boehringer Mannheim, Indianapolis IN). The 
Klenow reactions were incubated at 25°C for 15 minutes to 
1 hour. The polynucleotide kinase reactions were incubated 
at 27°C for 30 minutes to 1 hour. 

B. End-labeling with digoxigenin. 

The biotinylated, 
radiolabelled oligonucleotides or radiolabeled 
oligonucleotides were isolated as above and resuspended in 
0.2 M potassium cacodylate (pH 7.2), 4 mM MgCl2, 1 mM 
2-mercaptoethanol, and 0.5 mg/ml bovine serum albumin. To 
this reaction mixture digoxigenin-11-dUTP (an analog of 
dTTP, 2'-deoxy-uridine-5'-triphosphate, coupled to 
digoxigenin via an 11-atom spacer arm, Boehringer Mannheim, 
Indianapolis IN) and terminal deoxynucleotidyl transferase 
(GIBCO BRL, Gaithersburg, MD) were added. The number of 
Dig-11-dUTP moieties incorporated using this method 
appeared to be less than 5 (probably only 1 or 2) as judged 
by electrophoretic mobility on polyacrylamide gels of the 
treated fragment as compared to oligonucleotides of known 
length. 

The biotinylated or non-biotinylated, 
digoxigenin-containing, radiolabelled oligonucleotides were isolated as 
above and resuspended in 10 mM Tris-HCl, 1 mM EDTA, 50 mM 
NaCl, pH 7.5 for use in the binding assays. 

The above procedure can also be used to biotinylate 
the other strand by using an oligonucleotide containing the 
screening sequence complementary to the one shown in Figure 
4 and a primer complementary to the 3' end of that 
molecule. To accomplish the biotinylation Biotin-7-dATP 
was substituted for Biotin-11-dUTP. Biotinylation was also 
accomplished by chemical synthetic methods: for example, an 
activated nucleotide is incorporated into the 
oligonucleotide and the active group is subsequently 
reacted with NHS-LC-Biotin (Pierce). Other biotin 
derivatives can also be used. 

C. Radiolabelling the Oligonucleotides 

Generally, oligonucleotides were radiolabelled with 
gamma-32P-ATP or alpha-32P-deoxynucleotide triphosphates and 
T4 polynucleotide kinase or the Klenow fragment of DNA 
polymerase, respectively. Labelling reactions were 
performed in the buffers and by the methods recommended by 
the manufacturers (New England Biolabs, Beverly MA; 
Bethesda Research Laboratories, Gaithersburg MD; or 




Boehringer Mannheim, Indianapolis IN). Oligonucleotides 
were separated from buffer and unincorporated triphosphates 
using G-25 Sephadex preformed spin columns (IBI, New Haven, 
CT; or Biorad, Richmond, CA) or "NENSORB" preformed columns 
(New England Nuclear, Wilmington, DE) as per the 
manufacturer's instructions. 

There are several reasons to enzymatically synthesize 
the second strand. The two main reasons are as follows. 
First, by using an excess of primer, second strand synthesis 
can be driven to near completion so that nearly all top 
strands are annealed to bottom strands, which prevents the 
top strand single strands from folding back and creating 
additional and unrelated double-stranded structures. 
Secondly, since all of the oligonucleotides are primed with 
a common primer, the primer can bear the end-label so that 
all of the oligonucleotides will be labeled to exactly the 
same specific activity. 

Example 2 
Preparation of the UL9 Protein 
A. Cloning of the UL9 coding sequences into pAC373. 

To express full length UL9 protein a baculovirus 
expression system has been used. The sequence of the UL9 
coding region of Herpes Simplex Virus has been disclosed by 
McGeoch et al. and is available as an EMBL nucleic acid 
sequence. The recombinant baculovirus AcNPV/UL9A, which 
contained the UL9 coding sequence, was obtained from Mark 
Challberg (National Institutes of Health, Bethesda MD). 
The construction of this vector has been previously 
described (Olivo et al. (1988, 1989)). Briefly, the 
NarI/EcoRV fragment was derived from pMC160 (Wu et al.). Blunt 
ends were generated on this fragment by using all four 
dNTPs and the Klenow fragment of DNA polymerase I 
(Boehringer Mannheim, Indianapolis IN) to fill in the 
terminal overhangs. The resulting fragment was blunt-end 
ligated into the unique BamHI site of the baculoviral 
vector pAC373 (Summers et al.). 

B. Cloning of the UL9 coding sequence in pVL1393 

The UL9 coding region was cloned into a second 
baculovirus vector, pVL1393 (Luckow et al.). The 3077 bp 
NarI/EcoRV fragment containing the UL9 gene was excised 
from vector pEcoD (obtained from Dr. Bing Lan Rong, Eye 
Research Institute, Boston, MA): the plasmid pEcoD 
contains a 16.2 kb EcoRI fragment derived from HSV-1 that 
bears the UL9 gene (Goldin et al.). Blunt ends were 
generated on the UL9-containing fragment as described 
above. EcoRI linkers (10 mer) were blunt-end ligated 
(Ausubel et al.; Sambrook et al.) to the blunt-ended 
NarI/EcoRV fragment. 

The vector pVL1393 (Luckow et al.) was digested with 
EcoRI and the linearized vector isolated. This vector 
contains 35 nucleotides of the 5' end of the coding region 
of the polyhedron gene upstream of the polylinker cloning 
site. The polyhedron gene ATG has been mutated to ATT to 
prevent translational initiation in recombinant clones that 
do not contain a coding sequence with a functional ATG. 
The EcoRI UL9 fragment was ligated into the linearized 
vector, the ligation mixture transformed into E. coli and 
ampicillin resistant clones selected. Plasmids recovered 
from the clones were analyzed by restriction digestion and 
plasmids carrying the insert with the amino terminal UL9 
coding sequences oriented to the 5' end of the polyhedron 
gene were selected. This plasmid was designated 
pVL1393/UL9 (Figure 7). 

pVL1393/UL9 was cotransfected with wild-type 
baculoviral DNA (AcMNPV; Summers et al.) into Sf9 
(Spodoptera frugiperda) cells (Summers et al.). 
Recombinant baculovirus-infected Sf9 cells were identified 
and clonally purified (Summers et al.). 

C. Expression of the UL9 Protein. 

Clonal isolates of recombinant baculovirus infected 
Sf9 cells were grown in Grace's medium as described by 
Summers et al. The cells were scraped from tissue culture 
plates and collected by centrifugation (2,000 rpm, for 5 
minutes, 4°C). The cells were then washed once with 
phosphate buffered saline (PBS) (Maniatis et al.). Cell 
pellets were frozen at -70°C. For lysis the cells were 
resuspended in 1.5 volumes 20 mM HEPES, pH 7.5, 10% 
glycerol, 1.7 M NaCl, 0.5 mM EDTA, 1 mM dithiothreitol 
(DTT), and 0.5 mM phenyl methyl sulfonyl fluoride (PMSF). 
Cell lysates were cleared by ultracentrifugation (Beckman 
table top ultracentrifuge, TLS 55 rotor, 34 krpm, 1 hr, 
4°C). The supernatant was dialyzed overnight at 4°C 
against 2 liters dialysis buffer (20 mM HEPES, pH 7.5, 10% 
glycerol, 50 mM NaCl, 0.5 mM EDTA, 1 mM DTT, and 0.1 mM 
PMSF). 

These partially purified extracts were prepared and 
used in DNA:protein binding experiments. If necessary, 
extracts were concentrated using a "CENTRICON 30" 
filtration device (Amicon, Danvers MA). 

D. Cloning the Truncated UL9 Protein. 

The sequence encoding the C-terminal third of UL9 and 
the 3' flanking sequences, an approximately 1.2 kb 
fragment, was subcloned into the bacterial expression 
vector, pGEX-2T (Figure 6). The pGEX-2T is a modification 
of the pGEX-1 vector of Smith et al. which involved the 
insertion of a thrombin cleavage sequence in-frame with the 
glutathione-S-transferase protein (gst). 

A 1,194 bp BamHI/EcoRV fragment of pEcoD was isolated 
that contained a 951 bp region encoding the C-terminal 317 
amino acids of UL9 and 243 bp of the 3' untranslated 
region. 

This BamHI/EcoRV UL9 carboxy-terminal (UL9-COOH) 
containing fragment was blunt-ended and EcoRI linkers added 
as described above. The EcoRI linkers were designed to 
allow in-frame fusion of the UL9 coding sequence to the 
gst-thrombin coding sequences. The linkered fragment was 
isolated and digested with EcoRI. The pGEX-2T vector was 
digested with EcoRI, treated with Calf Intestinal Alkaline 
Phosphatase (CIP) and the linear vector isolated. The 
EcoRI linkered UL9-COOH fragment was ligated to the linear 
vector (Figure 6). The ligation mixture was transformed 
into E. coli and ampicillin resistant colonies were 
selected. Plasmids were isolated from the ampicillin 
resistant colonies and analyzed by restriction enzyme 
digestion. A plasmid which generated a 
gst/thrombin/UL9-COOH in-frame fusion was identified (Figure 6) and 
designated pGEX-2T/UL9-COOH. 

E. Expression of the Truncated UL9 Protein. 

E. coli strain JM109 was transformed with 
pGEX-2T/C-UL9-COOH and was grown at 37°C to saturation density 
overnight. The overnight culture was diluted 1:10 with LB 
medium containing ampicillin and grown for one hour at 
30°C. IPTG (isopropylthio-β-galactoside) (GIBCO-BRL) was 
added to a final concentration of 0.1 mM and the incubation 
was continued for 2-5 hours. Bacterial cells containing 
the plasmid were subjected to the temperature shift and 
IPTG conditions, which induced transcription from the tac 
promoter. 

Cells were harvested by centrifugation and resuspended 
in 1/100 culture volume of MTPBS (150 mM NaCl, 16 mM 
Na2HPO4, 4 mM NaH2PO4). Cells were lysed by sonication and 
lysates cleared of cellular debris by centrifugation. 

The fusion protein was purified over a glutathione 
agarose affinity column as described in detail by Smith et 
al. The fusion protein was eluted from the affinity column 
with reduced glutathione, dialyzed against UL9 dialysis 
buffer (20 mM HEPES pH 7.5, 50 mM NaCl, 0.5 mM EDTA, 1 mM 
DTT, 0.1 mM PMSF) and cleaved with thrombin (2 ng/µg of 
fusion protein). 

An aliquot of the supernatant obtained from 
IPTG-induced cultures of pGEX-2T/C-UL9-COOH-containing cells and 
an aliquot of the affinity-purified, thrombin-cleaved 
protein were analyzed by SDS-polyacrylamide gel 
electrophoresis. The result of this analysis is shown in 
Figure 8. The 63 kilodalton GST/C-UL9 fusion protein is 
the largest band in the lane marked GST-UL9 (lane 2). The 
first lane contains protein size standards. The UL9-COOH 
protein band (lane GST-UL9 + Thrombin, Figure 8, lane 3) is 
the band located between 30 and 46 kD; the glutathione 
transferase protein is located just below the 30 kD size 
standard. In a separate experiment a similar analysis was 
performed using the uninduced culture: it showed no 
protein corresponding in size to the fusion protein. 

Extracts are dialyzed before use. Also, if necessary, 
the extracts can be concentrated typically by filtration 
using a "CENTRICON 30" filter. 

Example 3 

Binding Assays 

A. Band shift gels. 

DNA:protein binding reactions containing both labelled 
complexes and free DNA were separated electrophoretically 
on 4-10% polyacrylamide/Tris-Borate-EDTA (TBE) gels (Fried 
et al.; Garner et al.). The gels were then fixed, dried, 
and exposed to X-ray film. The autoradiograms of the gels 
were examined for band shift patterns. 

B. Filter Binding Assays 

A second method used particularly in determining the 
off-rates for protein:oligonucleotide complexes is filter 
binding (Woodbury et al.). Nitrocellulose disks 
(Schleicher and Schuell, BA85 filters) that have been 
soaked in binding buffer (see below) were placed on a 
vacuum filter apparatus. DNA:protein binding reactions 
(see below; typically 15-30 µl) are diluted to 0.5 ml with 
binding buffer (this dilutes the concentration of 
components without dissociating complexes) and applied to 
the discs with vacuum applied. Under low salt conditions 
the DNA:protein complex sticks to the filter while free DNA 
passes through. The discs are placed in scintillation 
counting fluid (New England Nuclear), and the cpm 
determined using a scintillation counter. 

This technique has been adapted to 96-well and 72-slot
nitrocellulose filtration plates (Schleicher and Schuell)
using the above protocol except (i) the reaction dilution
and wash volumes are reduced and (ii) the flow rate through
the filter is controlled by adjusting the vacuum pressure.
This method greatly increases the number of assay samples
that can be analyzed. Using radioactive oligonucleotides,
the samples are applied to nitrocellulose filters, the
filters are exposed to x-ray film, then analyzed using a
Molecular Dynamics scanning densitometer. This system can
transfer data directly into analytical software programs
(e.g., Excel) for analysis and graphic display.
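
Filter counts of the kind described above are commonly reduced to the fraction of input DNA retained on the filter. The following is a minimal sketch in Python, assuming a simple background subtraction; the function name `fraction_bound` and the numbers in the usage line are illustrative assumptions, not values from this document.

```python
def fraction_bound(sample_cpm, background_cpm, input_cpm):
    """Fraction of labelled DNA retained on the filter as DNA:protein complex.

    sample_cpm     -- counts on the filter for the binding reaction
    background_cpm -- counts retained with no protein added (assumed control)
    input_cpm      -- total counts added to the reaction
    """
    if input_cpm <= background_cpm:
        raise ValueError("input counts must exceed background counts")
    frac = (sample_cpm - background_cpm) / (input_cpm - background_cpm)
    # Clamp to [0, 1] so counting noise cannot give a nonphysical fraction.
    return max(0.0, min(1.0, frac))


if __name__ == "__main__":
    # Illustrative numbers only.
    print(fraction_bound(8200, 200, 10200))
```

The same reduction could be applied to densitometer readings from the 96-well format, with band or slot intensities in place of cpm.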

Example 4 
Functional UL9 Binding Assay 
A. Functional DNA-binding Activity Assay 
Purified protein was tested for functional activity
using band-shift assays. Radiolabelled oligonucleotides
(prepared as in Example 1B) that contain the 11 bp
recognition sequence were mixed with the UL9 protein in
binding buffer (optimized reaction conditions: 0.1 ng
32P-DNA, 1 µl UL9 extract, 20 mM HEPES, pH 7.2, 50 mM KCl,
and 1 mM DTT). The reactions were incubated at room
temperature for 10 minutes (binding occurs in less than 2
minutes), then separated electrophoretically on 4-10%
non-denaturing polyacrylamide gels. UL9-specific binding to
the oligonucleotide is indicated by a shift in mobility of
the oligonucleotide on the gel in the presence of the UL9
protein but not in its absence. Bacterial extracts
containing (+) or without (-) UL9 protein and affinity
purified UL9 protein were tested in the assay. Only
bacterial extracts containing UL9 or affinity purified UL9
protein generate the gel band-shift indicating protein
binding.

The amount of extract that needed to be added to the
reaction mix, in order to obtain UL9 protein excess
relative to the oligonucleotide, was empirically determined
for each protein preparation/extract. Aliquots of the
preparation were added to the reaction mix and treated as
above. The quantity of extract at which the majority of
the labelled oligonucleotide appears in the DNA:protein
complex was evaluated by band-shift or filter binding
assays. The assay is most sensitive under conditions in
which the minimum amount of protein is added to bind most
of the DNA. Excess protein can decrease the sensitivity of
the assay.
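
The titration described above amounts to picking the smallest amount of extract that still binds most of the DNA. A minimal sketch in Python follows; the 0.9 cutoff, the function name, and the example titration values are assumptions chosen for illustration and are not taken from this document.

```python
def minimal_saturating_volume(titration, threshold=0.9):
    """Return the smallest extract volume whose fraction bound meets threshold.

    titration -- iterable of (extract_volume_ul, fraction_bound) pairs
    threshold -- assumed cutoff for "most of the DNA is bound"
    """
    for volume, bound in sorted(titration):
        if bound >= threshold:
            return volume
    return None  # no tested volume reached the cutoff


if __name__ == "__main__":
    # Hypothetical titration of one extract preparation.
    example = [(2.0, 0.97), (0.25, 0.35), (1.0, 0.92), (0.5, 0.70)]
    print(minimal_saturating_volume(example))
```

Choosing the lowest volume that passes the cutoff, rather than the volume giving the highest fraction bound, reflects the statement that excess protein can decrease sensitivity.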

B. Rate of Dissociation 

The rate of dissociation is determined using a
competition assay. An oligonucleotide having the sequence
presented in Figure 4, which contained the binding site for
UL9 (SEQ ID NO:14), was radiolabelled with 32P-ATP and
polynucleotide kinase (Bethesda Research Laboratories).
The competitor DNA was a 17 base pair oligonucleotide (SEQ
ID NO:16) containing the binding site for UL9.

In the competition assays, the binding reactions
(Example 4A) were assembled with each of the
oligonucleotides and placed on ice. Unlabelled
oligonucleotide (1 µg) was added 1, 2, 4, 6, or 21 hours
before loading the reaction on an 8% polyacrylamide gel
(run in TBE buffer (Maniatis et al.)) to separate the
reaction components. The dissociation rates, under these
conditions, for the truncated UL9 (UL9-COOH) and the full
length UL9 are approximately 4 hours at 4°C. In addition,
random oligonucleotides (a 10,000-fold excess) that did not
contain the UL9 binding sequence and sheared herring sperm
DNA (a 100,000-fold excess) were tested: neither of these
control DNAs competed for binding with the oligonucleotide
containing the UL9 binding site.
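
If the figure of approximately 4 hours is read as a half-life for a first-order dissociation (an assumption; the text does not state which measure is meant), the fraction of labelled complex expected to survive each competitor incubation can be sketched in Python as follows. The function name is hypothetical.

```python
import math


def fraction_remaining(hours, half_life_hours=4.0):
    """Fraction of labelled complex left after `hours` of competition.

    Assumes simple first-order dissociation and that the quoted
    4 hours is a half-life; both are assumptions of this sketch.
    """
    k_off = math.log(2) / half_life_hours  # per hour
    return math.exp(-k_off * hours)


if __name__ == "__main__":
    for t in (1, 2, 4, 6, 21):  # competitor incubation times from Example 4B
        print(f"{t:>2} h: {fraction_remaining(t):.3f}")
```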

C. Optimization of the UL9 Binding Assay
(i) Truncated UL9 from the bacterial expression
system.

The effects of the following components on the binding
and dissociation rates of UL9-COOH with its cognate binding
site have been tested and optimized: buffering conditions
(including the pH, type of buffer, and concentration of
buffer); the type and concentration of monovalent cation;
the presence of divalent cations and heavy metals;
temperature; various polyvalent cations at different
concentrations; and different redox reagents at different
concentrations. The effect of a given component was
evaluated starting with the reaction conditions given above
and based on the dissociation reactions described in
Example 4B.

The optimized conditions used for the binding of
UL9-COOH contained in bacterial extracts (Example 2E) to
oligonucleotides containing the HSV ori sequence (SEQ ID
NO:1) were as follows: 20 mM HEPES, pH 7.2, 50 mM KCl, 1
mM DTT, 0.005 - 0.1 ng radiolabeled (specific activity,
approximately 10* cpm/µg) or digoxiginated, biotinylated
oligonucleotide probe, and 5-10 ng crude UL9-COOH protein
preparation (1 mM EDTA is optional in the reaction mix).
Under optimized conditions, UL9-COOH binds very rapidly and
has a dissociation rate of about 4 hours at 4°C with
non-biotinylated oligonucleotide and 5-10 minutes with
biotinylated oligonucleotides. The dissociation rate of
UL9-COOH changes markedly under different physical
conditions. Typically, the activity of a UL9 protein
preparation was assessed using the gel band-shift assay and
related to the total protein content of the extract as a
method of standardization. The addition of herring sperm
DNA depended on the purity of UL9 used in the experiment.
Binding assays were incubated at 25°C for 5-30 minutes.

(ii) Full length UL9 protein from the baculovirus
system.

The binding reaction conditions for the full length
baculovirus-produced UL9 polypeptide have also been
optimized. The optimal conditions for the current assay
were determined to be as follows: 20 mM Hepes; 100 mM
NaCl; 0.5 mM dithiothreitol; 1 mM EDTA; 5% glycerol; from
0 to 10*-fold excess of sheared herring sperm DNA; 0.005 -
0.1 ng radiolabeled (specific activity, approximately 10«
cpm/µg) or digoxiginated, biotinylated oligonucleotide
probe, and 5-10 µg crude UL9 protein preparation. The full
length protein also binds well under the optimized
conditions established for the truncated UL9-COOH protein.

Example 5

The Effect of Test Sequence Variation on the
Off-Rate of the UL9 Protein

The oligonucleotides shown in Figure 5 were
radiolabelled as described above. The competition assays
were performed as described in Example 4B using UL9-COOH.
Radiolabelled oligonucleotides were mixed with the UL9-COOH
protein in binding buffer (typical reaction: 0.1 ng
oligonucleotide 32P-DNA, 1 µl UL9-COOH extract, 20 mM HEPES,
pH 7.2, 50 mM KCl, 1 mM EDTA, and 1 mM DTT). The reactions
were incubated at room temperature for 10 minutes. A zero
time point sample was then taken and loaded onto an 8%
polyacrylamide gel (run using TBE). One ng of the unlabelled
17 bp competitive DNA oligonucleotide (SEQ ID NO:16)
(Example 4B) was added at 5, 10, 15, 20, or 60 minutes
before loading the reaction sample on the gel. The results
of this analysis are shown in Figure 9: the test
sequences that flank the UL9 binding site (SEQ ID NO:5-SEQ
ID NO:13) are very dissimilar but have little effect on the
off-rate of UL9. Accordingly, these results show that the
UL9 DNA binding protein is effective to bind to a screening
sequence in duplex DNA with a binding affinity that is
substantially independent of test sequences placed adjacent
the screening sequence. Filter binding experiments gave
the same result.
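
One way to compare off-rates between oligonucleotides from a time course like this is to fit a first-order rate constant to the fraction of complex remaining at each time point. The sketch below, in Python, assumes first-order dissociation and uses a plain least-squares fit of ln(fraction) against time; the function name and the synthetic data in the usage lines are illustrative assumptions, not results from this document.

```python
import math


def fit_off_rate(times, fractions):
    """Estimate k_off from (time, fraction bound) pairs.

    Fits ln(fraction) = intercept - k_off * time by ordinary least
    squares and returns k_off in reciprocal time units.
    """
    pts = [(t, math.log(f)) for t, f in zip(times, fractions) if f > 0]
    if len(pts) < 2:
        raise ValueError("need at least two positive data points")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    return -sxy / sxx


if __name__ == "__main__":
    minutes = [0, 5, 10, 15, 20, 60]  # sampling scheme of Example 5
    # Synthetic, noise-free data generated with k_off = 0.01 per minute.
    synthetic = [math.exp(-0.01 * t) for t in minutes]
    print(fit_off_rate(minutes, synthetic))
```

Fitted constants for oligonucleotides with different flanking test sequences can then be compared directly; similar values would correspond to the conclusion that the test sequence has little effect on the off-rate.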

Example 6

The Effect of Actinomycin D, Distamycin A, and
Doxorubicin on UL9 Binding to the Screening Sequence
is Dependent on the Specific Test Sequence

Different oligonucleotides, each of which contained
the screening sequence (SEQ ID NO:1) flanked on the 5' and
3' sides by a test sequence (SEQ ID NO:5 to SEQ ID NO:13),
were evaluated for the effects of distamycin A, actinomycin
D, and doxorubicin on UL9-COOH binding.
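
The layout of these oligonucleotides (screening sequence in the middle, test sequence on either side) can be checked against the sequence listing with a short script. The sketch below, in Python, uses SEQ ID NO:1 and a subset of the assay oligonucleotides as given in the sequence listing; the function name `split_flanks` is a hypothetical helper.

```python
SCREENING = "CGTTCGCACTT"  # SEQ ID NO:1, UL9 binding site

# Sequences as given in the sequence listing, in 10-base groups.
OLIGOS = {
    "UL9 CCCG (SEQ ID NO:5)": "GGCCCGCCCC" "GTTCGCACTT" "CCCGCCCCGG",
    "UL9 ATAT (SEQ ID NO:7)": "GGATATATAC" "GTTCGCACTT" "TAATTATTGG",
    "UL9 polyA (SEQ ID NO:8)": "GGAAAAAAAC" "GTTCGCACTT" "AAAAAAAAGG",
    "UL9 polyT (SEQ ID NO:9)": "GGTTTTTTTC" "GTTCGCACTT" "TTTTTTTTGG",
    "oriEco2 (SEQ ID NO:12)": "GGCGAATTCG" "ACGTTCGCAC" "TTCGTCCCAA" "T",
}


def split_flanks(oligo, site=SCREENING):
    """Return (5' flank, 3' flank) around the first occurrence of site,
    or None if the site is absent."""
    start = oligo.find(site)
    if start < 0:
        return None
    return oligo[:start], oligo[start + len(site):]


if __name__ == "__main__":
    for name, seq in OLIGOS.items():
        print(name, split_flanks(seq))
```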

Binding assays were performed as described in Example
5. The oligonucleotides used in the assays are shown in
Figure 5. The assay mixture was allowed to pre-equilibrate
for 15 minutes at room temperature prior to the addition of
drug.

A concentrated solution of Distamycin A was prepared
in dH2O and was added to the binding reactions at the
following concentrations: 0, 1 µM, 4 µM, 16 µM, and 40 µM.
The drug was added and incubated at room temperature for 1
hour. The reaction mixtures were then loaded on an 8%
polyacrylamide gel (Example 5) and the components separated
electrophoretically. Autoradiographs of these gels are
shown in Figure 10A. The test sequences tested were as
follows: UL9 polyT, SEQ ID NO:9; UL9 CCCG, SEQ ID NO:5;
UL9 GGGC, SEQ ID NO:6; UL9 polyA, SEQ ID NO:8; and UL9
ATAT, SEQ ID NO:7. These results demonstrate that
Distamycin A preferentially disrupts binding to UL9 polyT,
UL9 polyA and UL9 ATAT.

A concentrated solution of Actinomycin D was prepared
in dH2O and was added to the binding reactions at the
following concentrations: 0 µM and 50 µM. The drug was
added and incubated at room temperature for 1 hour. Equal
volumes of dH2O were added to the control samples. The
reaction mixtures were then loaded on an 8% polyacrylamide
gel (Example 5) and the components separated
electrophoretically. Autoradiographs of these gels are
shown in Figure 10B. In addition to the test sequences
tested above with Distamycin A, the following test
sequences were also tested with Actinomycin D: ATori1, SEQ
ID NO:11; oriEco2, SEQ ID NO:12; and oriEco3, SEQ ID NO:13.
These results demonstrate that actinomycin D preferentially
disrupts the binding of UL9 to the oligonucleotides UL9
CCCG and UL9 GGGC.

A concentrated solution of Doxorubicin was prepared in
dH2O and was added to the binding reactions at the following
concentrations: 0 µM, 15 µM and 35 µM. The drug was added
and incubated at room temperature for 1 hour. Equal
volumes of dH2O were added to the control samples. The
reaction mixtures were then loaded on an 8% polyacrylamide
gel (Example 5) and the components separated
electrophoretically. Autoradiographs of these gels are
shown in Figure 10C. The same test sequences were tested
as for Actinomycin D. These results demonstrate that
Doxorubicin preferentially disrupts the binding of UL9 to
the oligonucleotides UL9 polyT, UL9 GGGC, oriEco2, and
oriEco3. Doxorubicin appears to particularly disrupt the
UL9:screening sequence interaction when the test sequence
oriEco3 is used. The test sequences for
oriEco2 and oriEco3 differ by only one base: an additional
T residue inserted at position 12 (compare SEQ ID NO:12 and
SEQ ID NO:13).
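
The single-base difference between oriEco2 and oriEco3 can be confirmed directly from the sequence listing. The sketch below, in Python, searches for every position at which deleting one base from the longer sequence gives the shorter one; the function name is a hypothetical helper, and the sequences are those of SEQ ID NO:12 and SEQ ID NO:13.

```python
# Sequences as given in the sequence listing, in 10-base groups.
ORI_ECO2 = "GGCGAATTCG" "ACGTTCGCAC" "TTCGTCCCAA" "T"    # SEQ ID NO:12, 31 bp
ORI_ECO3 = "GGCGAATTCG" "ATCGTTCGCA" "CTTCGTCCCA" "AT"   # SEQ ID NO:13, 32 bp


def single_insertions(shorter, longer):
    """List (1-based position, base) insertions that turn shorter into longer."""
    if len(longer) != len(shorter) + 1:
        return []
    hits = []
    for i, base in enumerate(longer):
        # Deleting position i from the longer sequence must give the shorter one.
        if longer[:i] + longer[i + 1:] == shorter:
            hits.append((i + 1, base))
    return hits


if __name__ == "__main__":
    print(single_insertions(ORI_ECO2, ORI_ECO3))
```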

Example 7

Use of the Biotin/Streptavidin Reporter System
A. The Capture of Protein-Free DNA.

Several methods have been employed to sequester
unbound DNA from DNA:protein complexes.

(i) Magnetic beads

Streptavidin-conjugated superparamagnetic polystyrene
beads (Dynabeads M-280 Streptavidin, Dynal AS, 6-7x10*
beads/ml) are washed in binding buffer then used to capture
biotinylated oligonucleotides (Example 1). The beads are
added to a 15 µl binding reaction mixture containing
binding buffer and biotinylated oligonucleotide. The
beads/oligonucleotide mixture is incubated for varying
lengths of time with the binding mixture to determine the
incubation period to maximize capture of protein-free
biotinylated oligonucleotides. After capture of the
biotinylated oligonucleotide, the beads can be retrieved by
placing the reaction tubes in a magnetic rack (96-well
plate magnets are available from Dynal). The beads are
then washed.

(ii) Agarose beads

Biotinylated agarose beads (immobilized D-biotin,
Pierce, Rockford, IL) are bound to avidin by treating the
beads with 50 ng/µl avidin in binding buffer overnight at
4°C. The beads are washed in binding buffer, then mixed
with binding mixtures to capture biotinylated DNA. The
beads are removed by centrifugation or by collection on a
non-binding filter disc.

For either of the above methods, quantification of the
presence of the oligonucleotide depends on the method of
labelling the oligonucleotide. If the oligonucleotide is
radioactively labelled: (i) the beads and supernatant can
be loaded onto polyacrylamide gels to separate protein:DNA
complexes from the bead:DNA complexes by electrophoresis,
and autoradiography performed; (ii) the beads can be placed
in scintillation fluid and counted in a scintillation
counter. Alternatively, presence of the oligonucleotide
can be determined using a chemiluminescent or colorimetric
detection system.

B. Detection of Protein-Free DNA.

The DNA is end-labelled with digoxigenin-11-dUTP
(Example 1). The antigenic digoxigenin moiety is
recognized by an antibody-enzyme conjugate,
anti-digoxigenin-alkaline phosphatase (Boehringer Mannheim,
Indianapolis, IN). The DNA/antibody-enzyme conjugate is
then exposed to the substrate of choice. The presence of
dig-dUTP does not alter the ability of protein to bind the
DNA or the ability of streptavidin to bind biotin.

(i) Chemiluminescent Detection.

Digoxigenin-labelled oligonucleotides are detected
using the chemiluminescent detection system "SOUTHERN
LIGHTS" developed by Tropix, Inc. (Bedford, MA). Use of
this detection system is illustrated in Figures 11A and
11B. The technique can be applied to detect DNA that has
been captured on either beads or filters.

Biotinylated oligonucleotides, which have terminal
digoxigenin-containing residues (Example 1), are captured
on magnetic (Figure 11A) or agarose beads (Figure 11B) as
described above. The beads are isolated and treated to
block non-specific binding by incubation with I-Light
blocking buffer (Tropix) for 30 minutes at room
temperature. The presence of oligonucleotides is detected
using alkaline phosphatase-conjugated antibodies to
digoxigenin. Anti-digoxigenin-alkaline phosphatase
(anti-dig-AP, 1:5000 dilution of 0.75 units/µl, Boehringer
Mannheim) is incubated with the sample for 30 minutes,
decanted, and the sample washed with 100 mM Tris-HCl, pH
7.5, 150 mM NaCl. The sample is pre-equilibrated with 2
washes of 50 mM sodium bicarbonate, pH 9.5, 1 mM MgCl2, then
incubated in the same buffer containing 0.25 mM
3-(2'-spiroadamantane)-4-methoxy-4-(3'-phosphoryloxy)phenyl-1,2-dioxetane
disodium salt (AMPPD) for 5 minutes at room
temperature. AMPPD was developed (Tropix Inc.) as a
chemiluminescent substrate for alkaline phosphatase. Upon
dephosphorylation of AMPPD the resulting compound
decomposes, releasing a prolonged, steady emission of light
at 477 nm.

Excess liquid is removed from filters and the emission
of light occurring as a result of the dephosphorylation of
AMPPD by alkaline phosphatase can be measured by exposure
to x-ray film or by detection in a luminometer.

In solution, the bead-DNA-anti-dig-AP is resuspended
in "SOUTHERN LIGHT" assay buffer and AMPPD and measured
directly in a luminometer. Large scale screening assays
are performed using a 96-well plate-reading luminometer
(Dynatech Laboratories, Chantilly, VA). Subpicogram
quantities of DNA (10^ to 10^ attomoles (an attomole is 10^-18
moles)) can be detected using the Tropix system in
conjunction with the plate-reading luminometer.
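
To relate a mass of duplex oligonucleotide to attomoles, the mass can be divided by an approximate molar mass. The sketch below, in Python, assumes the common approximation of about 660 g/mol per base pair of duplex DNA; that figure, the function name, and the 1 pg / 30 bp example are assumptions for illustration and do not come from this document.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair; a common approximation (assumed)


def attomoles(mass_pg, length_bp):
    """Convert a mass of duplex DNA in picograms to attomoles (1e-18 mol)."""
    moles = (mass_pg * 1e-12) / (length_bp * AVG_BP_MASS)
    return moles / 1e-18


if __name__ == "__main__":
    # Illustrative: 1 pg of a 30 bp duplex oligonucleotide.
    print(f"{attomoles(1.0, 30):.1f} attomoles")
```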

(ii) Colorimetric Detection.

Standard alkaline phosphatase colorimetric substrates
are also suitable for the above detection reactions.
Typical substrates include 4-nitrophenyl phosphate
(Boehringer Mannheim). Results of colorimetric assays can
be evaluated in multiwell plates (as above) using a
plate-reading spectrophotometer (Molecular Devices, Menlo
Park, CA). The use of the light emission system is more
sensitive than the colorimetric systems.

While the invention has been described with reference
to specific methods and embodiments, it will be appreciated
that various modifications and changes may be made without
departing from the invention.

SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANT: Edwards, Cynthia A. 

Cantor, Charles R.
Andrews, Beth M.

(ii) TITLE OF INVENTION: Screening Assay for the Detection of
DNA-Binding Molecules 

(iii) NUMBER OF SEQUENCES: 18 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: LAW OFFICES OF PETER J. DEHLINGER 

(B) STREET: P.O. Box 60850 

(C) CITY: Palo Alto 

(D) STATE: CA 

(E) COUNTRY: USA 

(F) ZIP: 94306 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 



(vii) PREVIOUS APPLICATION DATA: 

(A) APPLICATION NUMBER: 07/723,618 

(B) FILING DATE: 27-JUN-1991 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Fabian, Gary R. 

(B) REGISTRATION NUMBER: 33,875 

(C) REFERENCE/DOCKET NUMBER: 4600-0075,41

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (415) 323-8302 

(B) TELEFAX: (415) 323-8306 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: UL9 BINDING SITE, HSV oriS, higher 
affinity 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
CGTTCGCACT T

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: UL9 BINDING SITE, HSV oriS, lower
affinity

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
TGCTCGCACT T 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: UL9Z1 TEST SEQ. / UL9 ASSAY SEQ.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GCGCGCGCGC GTTCGCACTT CCGCCGCCGG

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: UL9Z2 TEST SEQ. / UL9 ASSAY SEQ.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
GGCGCCGGCC GTTCGCACTT CGCGCGCGCG

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: UL9 CCCG TEST SEQ. / UL9 ASSAY SEQ 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 



GGCCCGCCCC GTTCGCACTT CCCGCCCCGG 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: UL9 GGGC TEST SEQ. / UL9 ASSAY SEQ

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GGCGGGCGCC GTTCGCACTT GGGCGGGCGG
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: UL9 ATAT TEST SEQ. / UL9 ASSAY SEQ 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
GGATATATAC GTTCGCACTT TAATTATTGG 
(2) INFORMATION FOR SEQ ID N0:8: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: UL9 polyA TEST SEQ. / UL9 ASSAY SEQ.

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
GGAAAAAAAC GTTCGCACTT AAAAAAAAGG
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: UL9 polyT TEST SEQ. / UL9 ASSAY 
SEQ. 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
GGTTTTTTTC GTTCGCACTT TTTTTTTTGG
(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: UL9 GCAC TEST SEQ. / UL9 ASSAY SEQ

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
GGACGCACGC GTTCGCACTT GCAGCAGCGG
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: UL9 ATori-1 TEST SEQUENCE / UL9 
ASSAY SEQ. 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GCGTATATAT CGTTCGCACT TCGTCCCAAT
(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: oriEco2 TEST SEQ. / UL9 ASSAY SEQ

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
GGCGAATTCG ACGTTCGCAC TTCGTCCCAA T
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: oriEco3 TEST SEQ. / UL9 ASSAY SEQ

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
GGCGAATTCG ATCGTTCGCA CTTCGTCCCA AT
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: WILD TYPE 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
AAGTGAGAAT TCGAAGCGTT CGCACTTCGT CCCAAT
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: TRUNCATED UL9 BINDING SITE, COMPARE
SEQ ID NO:1



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
TTCGCACTT 

(2) INFORMATION FOR SEQ ID N0:16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: HSVBl/4, SEQUENCE OF COMPETITOR DNA
MOLECULE

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GGTCGTTCGC ACTTCGC
(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:

(C) INDIVIDUAL ISOLATE: UL9 BINDING SITE, HSV oriS

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
CGTTCTCACTT

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: YES 

(iv) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(C) INDIVIDUAL ISOLATE: UL9 ASSAY SEQUENCE, FIGURE 15 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GTCTAANNrairaraCG^ 3 7

IT IS CLAIMED:

1. A method of screening for molecules capable of
binding to a selected test sequence in a duplex DNA,
comprising

(i) adding a molecule to be screened to a test system
composed of (a) a DNA binding protein which is effective to
bind to a screening sequence in a duplex DNA with a binding
affinity that is substantially independent of said test
sequence adjacent the screening sequence, but where said
protein binding is sensitive to binding of molecules to
such test sequence, and (b) a duplex DNA having said
screening and test sequences adjacent one another, wherein
the binding protein is present in molar excess over the
screening sequence present in the duplex DNA,

(ii) incubating the molecule in the test system for a
period sufficient to permit binding of the compound being
tested to the test sequence in the duplex DNA, and

(iii) detecting the amount of binding protein bound to
the duplex DNA before and after said adding.



2. The method of claim 1, wherein the screening
sequence/binding protein is selected from the group
consisting of EBV origin of replication/EBNA, HSV origin of
replication/UL9, VZV origin of replication/UL9-like, and
HPV origin of replication/E2, and lambda oL-oR/cro.

3. The method of claim 2, wherein the DNA screening
sequence is from the HSV origin of replication and the
binding protein is UL9.

4. The method of claim 3, wherein the DNA screening
sequence is selected from the group consisting of SEQ ID
NO:1, SEQ ID NO:2, SEQ ID NO:15, and SEQ ID NO:17.

5. The method of claim 1, wherein said detecting is
accomplished using either a gel band-shift assay or a
filter-binding assay.

6. The method of claim 1, wherein the test sequences
are selected from the group consisting of EBV origin of
replication, HSV origin of replication, VZV origin of
replication, HPV origin of replication, interleukin 2
enhancer, HIV-LTR, HBV enhancer, and fibrinogen promoter.

7. The method of claim 1, where the test sequences
are selected from a defined set of nucleic acid sequences.

8. The method of claim 7, wherein said defined set of
DNA sequences has [X^]*^ combinations, where Xn is sequence
of deoxyribonucleotides and the number of
deoxyribonucleotides in each sequence is N, N is greater
than or equal to three.

9. The method of claim 8, wherein N is 3-20.

10. The method of claim 9, wherein N is 4-10.

11. The method of claim 10, wherein N is 4 and the
number of combinations is 256.

12. The method of claim 1, wherein said detecting
includes the use of a capture system that traps DNA free of
bound protein.

13. The method of claim 12, wherein the capture
system involves the biotinylation of a nucleotide within
the screening sequence (i) that does not eliminate the
protein's ability to bind to the screening sequence, (ii)
that is capable of binding streptavidin, and (iii) wherein
the biotin moiety is protected from interactions with
streptavidin when the protein is bound to the screening
sequence.

14. The method of claim 1, wherein said binding
protein is present in a molar concentration less than or
equal to the molar concentration of the screening sequence
present in the duplex DNA.

15. The method of claim 1, wherein said defined set
of nucleic acid sequences are all possible sequential
combinations of a number of deoxyribonucleotides, N,
wherein N is less than 20 and more than 2.

16. The method of claim 15, wherein N is less than 10
and more than 2.

17. The method of claim 16, wherein N is 4.

18. A screening system for identifying molecules that
are capable of binding to a test sequence in a target
duplex DNA sequence, comprising

a duplex DNA having screening and test sequences
adjacent one another,

a DNA binding protein that is effective in binding to
said screening sequence in the duplex DNA with a binding
affinity that is substantially independent of said test
sequence adjacent the screening sequence, but which is
sensitive to binding of molecules to said test sequence,
wherein the binding protein is present in molar excess over
the screening sequence present in the duplex DNA, and means
for detecting the amount of binding protein bound to the
DNA.



35 
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19, The system of claim 18, wherein the test 
sequences are selected from the group consisting of EBV 
origin of replication, HSV origin of replication, VZV 
origin of replication, HPV origin of replication, 
inter leukin 2 enhancer, HIV-LTR, HBV enhancer, and 
fibrinogen promoter. 

20. The system of claim 18, where the test sequences 
are selected from a defined set of nucleic acid sec[uences« 



21. The system of claim 20, wherein said defined set 
of DNA sequences has [ X^] ^ combinations , where Xn is 
sequence of deoxyribonucleotides and the number of 
deoxyribonucleotides in each sequence is N, N is greater 

15 than or equal to three. 

22. The system of claim 21, where said 
deoxyribonucleotides are selected from the group consisting 
of deoxyr iboadenosine , deoxyriboguanosine, 

20 deoxyr ibocytidine, and deoxyr ibothymidine. 

23. The system of claim 21, wherein N is 3-20. 

24. The system of claim 23, wherein N is 4-10. 

25 

25. The system of claim 24, wherein N is 4 and the 
nximber of-combinations~is" 256T 

26. The system of claim 18, where the screening 
30 sequence/binding protein is selected from the group 

consisting of EBV origin of replication/EBNA, HSV origin of 
replication/UL9, VZV rigin of replication/UL9-like, and 
HPV origin f replication/E2, and lambda o^-0;j/cro. 
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27. The system of claim 26, wherein the DMA screening 
sequence is from the HSV origin of replication and the 
binding protein is DL9. 

28. The system of claim 27, wherein the DNA screening
sequence is selected from the group consisting of SEQ ID
NO:1, SEQ ID NO:2, SEQ ID NO:15 and SEQ ID NO:17.

29. The system of claim 28, where the DNA screening
sequence is SEQ ID NO:1.

30. The system of claim 29, where the U residue in 
position 8 is biotinylated. 

31. The system of claim 30, where said detection
means includes streptavidin, and the streptavidin is bound
to a solid support.

32. The system of claim 31, where streptavidin is
used to capture the duplex DNA when it is free of bound
protein.

33. A method for inhibiting the binding of a DNA-binding
protein to duplex DNA, comprising
contacting a compound with a duplex DNA which contains
a test sequence adjacent a screening sequence, where the
DNA binding protein is effective to bind to the screening
sequence with a binding affinity that is substantially
independent of said test sequence, further where the
binding of said compound to the test sequence inhibits the
binding of the protein to the screening sequence.

34. The method of claim 33, wherein the compound is
identified by the steps of
preparing a series of duplex nucleic acid fragments,
each containing a test sequence composed of one of the 4^N
possible permutations of sequences in a sequence of base
pairs having N basepairs, where said test sequence is
adjacent the screening sequence,
measuring the binding affinity of the DNA binding
protein to each of the series of nucleic acid fragments in
the presence of the compound, and
selecting the compound if it lowers the binding
affinity of the DNA binding protein for the screening
sequence.
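
The steps of claim 34 can be outlined in code. This is a minimal sketch under stated assumptions: the screening sequence CGTTCGCACTT is read from the figure sequences, the same test sequence is placed on both sides of it, and the binding values in the usage lines are made-up placeholders, not measurements. The names `build_fragments` and `lowers_binding` are hypothetical.

```python
from itertools import product

# Assumption: UL9 screening sequence as it appears in the figure sequences.
SCREENING = "CGTTCGCACTT"

def build_fragments(n, screening=SCREENING):
    """Map each of the 4**n test sequences to a top strand in which the
    test sequence flanks the screening sequence on both sides (assumed layout)."""
    tests = ("".join(p) for p in product("ACGT", repeat=n))
    return {t: t + screening + t for t in tests}

def lowers_binding(without_compound, with_compound):
    """Return the test sequences for which measured protein binding is lower
    in the presence of the compound than in its absence."""
    return [t for t in without_compound
            if with_compound[t] < without_compound[t]]

fragments = build_fragments(4)
print(len(fragments))      # 256 fragments for N = 4
print(fragments["GATC"])   # GATCCGTTCGCACTTGATC

# Placeholder values only, chosen to illustrate the selection step.
baseline = {"AAAA": 1.0, "GATC": 1.0}
treated = {"AAAA": 1.0, "GATC": 0.4}
print(lowers_binding(baseline, treated))  # ['GATC']
```

Under these assumptions the compound would be selected, because it lowers binding for at least one fragment, and the affected test sequences indicate its sequence preference.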
                           GAAGCAGGGTTA 5'

+ BIOTIN-11-dUTP

5' AAGTGAGAATTCGAAGCGTTCGCACTTCGTCCCAAT 3'
                          UGAAGCAGGGTTA 5'

PURIFY, THEN ADD dNTPs + KLENOW

5' AAGTGAGAATTCGAAGCGTTCGCACTTCGTCCCAAT 3'
3' TTCACTCTTAAGCTTCGCAAGCGUGAAGCAGGGTTA 5'

DIG-11-dUTP + TERMINAL TRANSFERASE
Test Sequence : Screening Sequence : Test Sequence

UL9 Z1      5'-GCGCGCGCGCGTTCGCACTTCCGCCGCCGG-3'    (Z-DNA)
UL9 Z2      5'-GGCGCCGGCCGTTCGCACTTCGCGCGCGCG-3'    (Z-DNA)
UL9 CCCG    5'-GGCCCGCCCCGTTCGCACTTCCCGCCCCGG-3'
UL9 GGGC    5'-GGCGGGCGCCGTTCGCACTTGGGCGGGCGG-3'
UL9 ATAT    5'-GGATATATACGTTCGCACTTTAATTATTGG-3'
UL9 polyA   5'-GGAAAAAAACGTTCGCACTTAAAAAAAAGG-3'
UL9 polyT   5'-GGTTTTTTTCGTTCGCACTTTTTTTTTTGG-3'
UL9 GCAC    5'-GGACGCACGCGTTCGCACTTGCAGCAGCGG-3'
ATori-1     5'-GCGTATATATCGTTCGCACTTCGTCCCAAT-3'
oriEco2     5'-GGCGAATTCGACGTTCGCACTTCGTCCCAAT-3'
oriEco3     5'-GGCGAATTCGATCGTTCGCACTTCGTCCCAAT-3'

Fig. 5
GATC   AGTC   TAGC   CGAT
GACT   AGCT   TACG   CGTA
GTCA   ATCG   TGCA   CATG
GTAC   ATGC   TGAC   CAGT
GCTA   ACTG   TCAG   CTAG
GCAT   ACGT   TCGA   CTGA

Fig. 14
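
The entries of Fig. 14 appear to be the four-base sequences that use each of G, A, T and C exactly once. As a hedged illustration of that reading (an interpretation of the figure, not a statement from the text), the following sketch generates such a set and counts it; the variable name `orderings` is hypothetical.

```python
from itertools import permutations

# Every ordering of the four bases in which each base is used exactly once.
orderings = sorted("".join(p) for p in permutations("GATC"))

print(len(orderings))  # 24, i.e. 4 factorial
print(orderings[:6])   # the six orderings that begin with A
```

These 24 sequences are a subset of the 256 possible four-base test sequences, which also include sequences with repeated bases.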
3'-CGCATTYYYYGCAAGCGTGAAYYYYGAAGCAGGGTTA-5'